Re: [Pulp-dev] Composed Repositories

2018-05-16 Thread Jeff Ortel



On 05/15/2018 11:59 AM, Brian Bouterse wrote:
I agree these are specific cases for a few content types that are used 
by multiple plugins. I think the most productive thing would be for us 
to talk specifically and only about kickstart trees being shared between 
RPM and ostree. It would be much easier to generalize after building 
something specific once (I think).


This discussion wasn't about generalization or abstraction.  It's about 
dealing with remote repositories that are different combinations of 
common content types.  That said, while searching for concrete examples 
(use cases), it turns out these combinations don't really exist.  In 
pulp2, the RPM plugin is used to sync ISO repositories but they are not 
combined with other content types in the same repository.  Kickstart 
trees are only combined with YUM repositories.  Combination 
OSTree/KS-tree repositories aren't really a thing.


I think this thread can end here.



A mentor I had once told me that all software that lives long enough goes 
through 3 phases. (1) A concrete implementation (2) generalizing that 
implementation, and then (3) rewriting that implementation because of 
everything you didn't know before. I'm advocating for us to think 
about the problem as a specific plugin problem (step 1) and then after 
that is done, to look at generalizing it (step 2).


On Tue, May 15, 2018 at 11:27 AM, Bryan Kearney wrote:


On 05/14/2018 03:44 PM, Jeff Ortel wrote:
> Let's brainstorm on something.
>
> Pulp needs to deal with remote repositories that are composed of
> multiple content types which may span the domain of a single plugin.
> Here are a few examples.  Some Red Hat RPM repositories are composed of:
> RPMs, DRPMs, , ISOs and Kickstart Trees.  Some OSTree repositories are
> composed of OSTrees & Kickstart Trees. This raises a question:
>
> How can pulp3 best support syncing with remote repositories that are
> composed of multiple (unrelated) content types in a way that doesn't
> result in plugins duplicating support for content types?
>


Both these examples are cases of RPM repos, yes? If so, does this
require a general purpose solution?

-- bk



___
Pulp-dev mailing list
Pulp-dev@redhat.com 
https://www.redhat.com/mailman/listinfo/pulp-dev







Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Milan Kovacik
On Tue, May 15, 2018 at 4:48 PM, Jeff Ortel  wrote:
>
>
> On 05/15/2018 09:29 AM, Milan Kovacik wrote:
>>
>> Hi,
>>
>> On Tue, May 15, 2018 at 3:22 PM, Dennis Kliban  wrote:
>>>
>>> On Mon, May 14, 2018 at 3:44 PM, Jeff Ortel  wrote:

 Let's brainstorm on something.

 Pulp needs to deal with remote repositories that are composed of
 multiple
 content types which may span the domain of a single plugin.  Here are a
 few
 examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
 ISOs and Kickstart Trees.  Some OSTree repositories are composed of
 OSTrees
 & Kickstart Trees. This raises a question:

 How can pulp3 best support syncing with remote repositories that are
 composed of multiple (unrelated) content types in a way that doesn't
 result
 in plugins duplicating support for content types?

 Few approaches come to mind:

 1. Multiple plugins (Remotes) participate in the sync flow to produce a
 new repository version.
 2. Multiple plugins (Remotes) are sync'd successively each producing a
 new
 version of a repository.  Only the last version contains the fully
 sync'd
 composition.
 3. Plugins share code.
 4. Other?


 Option #1: Sync would be orchestrated by core or the user so that
 multiple
 plugins (Remotes) participate in populating a new repository version.
 For
 example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
 would both be sync'd against the same remote repository that is composed
 of
 both types.  The new repository version would be composed of the result
 of
 both plugin (Remote) syncs.  To support this, we'd need to provide a way
 for
 each plugin to operate seamlessly on the same (new) repository version.
 Perhaps something internal to the RepositoryVersion.  The repository
 version
 would not be marked "complete" until the last plugin (Remote) sync has
 succeeded.  More complicated than #2 but results in only creating truly
 complete versions or nothing.  No idea how this would work with current
 REST
 API whereby plugins provide sync endpoints.

>>> I like this approach because it allows the user to perform a single call
>>> to
>>> the REST API and specify multiple "sync methods" to use to create a
>>> single
>>> new repository version.
>>
>> Same here, esp. if the goal is an all-or-nothing behavior w/r/t the
>> mix-in remotes; i.e. an atomic sync.
>> This has a benefit of a clear start and end of the sync procedure,
>> that the user might want to refer to.
>>
 Option #2: Sync would be orchestrated by core or the user so that
 multiple
 plugins (Remotes) create successive repository versions.  For example:
 the
 RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
 sync'd against the same remote repository that is a composition
 including
 both types.  The intermediate versions would be incomplete.  Only the
 last
 version contains the fully sync'd composition.  This approach can be
 supported by core today :) but will produce incomplete repository
 versions
 that are marked complete=True.  This /seems/ undesirable, right?  This
 may
 not be a problem for distribution since I would imagine that only the
 last
 (fully composed) version would be published.  But what about other
 usages of
 the repository's "latest" version?
>>
>> I'm afraid I don't see use of a middle-version esp. in case of
>> failures; e.g. ostree failed to sync while rpm managed and kickstart
>> managed too; is the sync OK as a whole? What to do with the versions
>> created? Should I merge the successes into one and retry the failure?
>> How many versions would this introduce?
>
>
> (option 2) The partial versions would be created in both normal and failure
> scenarios.  The normal scenario is created because each plugin (Remote)
> creates a new version and only the last one is completed.  The intermediate
> versions are always partial.

right but is there a legitimate use of the intermediate version?
if not, maybe Option #1 is better (atomic)
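
To make the atomic reading of Option #1 concrete, here is a minimal Python sketch. Everything in it is hypothetical: RepositoryVersion is a plain stand-in, not pulpcore's model, and atomic_sync and the three sync functions are invented for illustration. It only shows the all-or-nothing shape: every plugin (Remote) adds to the same new version, and the version is marked complete only if all of them succeed.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryVersion:
    """Hypothetical stand-in for a repository version; not pulpcore's model."""
    number: int
    content: set = field(default_factory=set)
    complete: bool = False


def atomic_sync(next_number, remote_syncs):
    """Run every plugin sync against one new version (Option #1).

    `remote_syncs` is a list of callables, one per plugin Remote, each
    returning the set of content units it found. If any of them raises,
    no version is returned, so nothing partial is ever visible.
    """
    version = RepositoryVersion(number=next_number)
    try:
        for sync in remote_syncs:
            version.content |= sync()
    except Exception:
        return None  # all-or-nothing: discard the unfinished version
    version.complete = True
    return version


def rpm_sync():
    return {"rpm:bash", "rpm:zsh"}


def kickstart_sync():
    return {"ks:rhel7-tree"}


def failing_sync():
    raise RuntimeError("remote unreachable")


ok = atomic_sync(1, [rpm_sync, kickstart_sync])
failed = atomic_sync(2, [rpm_sync, failing_sync])
```

In this sketch a failed run leaves no version behind at all, which is the "clear start and end" property discussed above; how core would actually coordinate separate plugin sync endpoints is left open, as in the thread.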

>
>>
 Option #3: requires a plugin to be aware of specific repository
 composition(s) and of other plugins, and creates a code dependency between
 plugins.
 For example, the RPM plugin could delegate ISOs to the File plugin and
 Kickstart Trees to the KickStart Tree plugin.
>>
>> Do you mean that the RPM plug-in would directly call into the File
>> plug-in?
>> If that's the case then I don't like it much, would be a pain every
>> time a new plug-in would be introduced (O(len(plugin)^2) of updates)
>> or if the API of a plug-in changed (O(len(plugin)) updates).
>> Esp. keeping the plugin code aware of other plugin updates would be ugly.
>
>
> Agreed.  The plugins could install libs into 

Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Brian Bouterse
I agree these are specific cases for a few content types that are used by
multiple plugins. I think the most productive thing would be for us to talk
specifically and only about kickstart trees being shared between RPM and ostree.
It would be much easier to generalize after building something specific
once (I think).

A mentor I had once told me that all software that lives long enough goes through 3
phases. (1) A concrete implementation (2) generalizing that implementation,
and then (3) rewriting that implementation because of everything you didn't
know before. I'm advocating for us to think about the problem as a specific
plugin problem (step 1) and then after that is done, to look at
generalizing it (step 2).

On Tue, May 15, 2018 at 11:27 AM, Bryan Kearney  wrote:

> On 05/14/2018 03:44 PM, Jeff Ortel wrote:
> > Let's brainstorm on something.
> >
> > Pulp needs to deal with remote repositories that are composed of
> > multiple content types which may span the domain of a single plugin.
> > Here are a few examples.  Some Red Hat RPM repositories are composed of:
> > RPMs, DRPMs, , ISOs and Kickstart Trees.  Some OSTree repositories are
> > composed of OSTrees & Kickstart Trees. This raises a question:
> >
> > How can pulp3 best support syncing with remote repositories that are
> > composed of multiple (unrelated) content types in a way that doesn't
> > result in plugins duplicating support for content types?
> >
>
>
> Both these examples are cases of RPM repos, yes? If so, does this
> require a general purpose solution?
>
> -- bk
>
>
>


Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Jeff Ortel



On 05/15/2018 10:41 AM, Jeff Ortel wrote:



On 05/15/2018 10:27 AM, Bryan Kearney wrote:

On 05/14/2018 03:44 PM, Jeff Ortel wrote:

Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed of
multiple content types which may span the domain of a single plugin.
Here are a few examples.  Some Red Hat RPM repositories are composed 
of:

RPMs, DRPMs, , ISOs and Kickstart Trees.  Some OSTree repositories are
composed of OSTrees & Kickstart Trees. This raises a question:

How can pulp3 best support syncing with remote repositories that are
composed of multiple (unrelated) content types in a way that doesn't
result in plugins duplicating support for content types?



Both these examples are cases of RPM repos, yes? If so, does this
require a general purpose solution?


The example in the thread is mainly RPM but there are other 
repositories with shared content types.  Eg: OSTree repositories also 
containing Kickstart Trees.


I also think there is value in not having the RPM plugin be a /mega/ 
plugin that knows how to deal with several complicated types of content 
(like in pulp2).  Making each plugin responsible for specific closely 
related types of content would make them more maintainable.






-- bk








Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Bryan Kearney
On 05/14/2018 03:44 PM, Jeff Ortel wrote:
> Let's brainstorm on something.
> 
> Pulp needs to deal with remote repositories that are composed of
> multiple content types which may span the domain of a single plugin. 
> Here are a few examples.  Some Red Hat RPM repositories are composed of:
> RPMs, DRPMs, , ISOs and Kickstart Trees.  Some OSTree repositories are
> composed of OSTrees & Kickstart Trees. This raises a question: 
> 
> How can pulp3 best support syncing with remote repositories that are
> composed of multiple (unrelated) content types in a way that doesn't
> result in plugins duplicating support for content types?
> 


Both these examples are cases of RPM repos, yes? If so, does this
require a general purpose solution?

-- bk






Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Jeff Ortel



On 05/15/2018 09:29 AM, Milan Kovacik wrote:

Hi,

On Tue, May 15, 2018 at 3:22 PM, Dennis Kliban  wrote:

On Mon, May 14, 2018 at 3:44 PM, Jeff Ortel  wrote:

Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed of multiple
content types which may span the domain of a single plugin.  Here are a few
examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
ISOs and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
& Kickstart Trees. This raises a question:

How can pulp3 best support syncing with remote repositories that are
composed of multiple (unrelated) content types in a way that doesn't result
in plugins duplicating support for content types?

Few approaches come to mind:

1. Multiple plugins (Remotes) participate in the sync flow to produce a
new repository version.
2. Multiple plugins (Remotes) are sync'd successively each producing a new
version of a repository.  Only the last version contains the fully sync'd
composition.
3. Plugins share code.
4. Other?


Option #1: Sync would be orchestrated by core or the user so that multiple
plugins (Remotes) participate in populating a new repository version.  For
example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
would both be sync'd against the same remote repository that is composed of
both types.  The new repository version would be composed of the result of
both plugin (Remote) syncs.  To support this, we'd need to provide a way for
each plugin to operate seamlessly on the same (new) repository version.
Perhaps something internal to the RepositoryVersion.  The repository version
would not be marked "complete" until the last plugin (Remote) sync has
succeeded.  More complicated than #2 but results in only creating truly
complete versions or nothing.  No idea how this would work with current REST
API whereby plugins provide sync endpoints.


I like this approach because it allows the user to perform a single call to
the REST API and specify multiple "sync methods" to use to create a single
new repository version.

Same here, esp. if the goal is an all-or-nothing behavior w/r/t the
mix-in remotes; i.e. an atomic sync.
This has a benefit of a clear start and end of the sync procedure,
that the user might want to refer to.


Option #2: Sync would be orchestrated by core or the user so that multiple
plugins (Remotes) create successive repository versions.  For example: the
RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
sync'd against the same remote repository that is a composition including
both types.  The intermediate versions would be incomplete.  Only the last
version contains the fully sync'd composition.  This approach can be
supported by core today :) but will produce incomplete repository versions
that are marked complete=True.  This /seems/ undesirable, right?  This may
not be a problem for distribution since I would imagine that only the last
(fully composed) version would be published.  But what about other usages of
the repository's "latest" version?

I'm afraid I don't see use of a middle-version esp. in case of
failures; e.g. ostree failed to sync while rpm managed and kickstart
managed too; is the sync OK as a whole? What to do with the versions
created? Should I merge the successes into one and retry the failure?
How many versions would this introduce?


(option 2) The partial versions would be created in both normal and 
failure scenarios.  The normal scenario is created because each plugin 
(Remote) creates a new version and only the last one is completed.  The 
intermediate versions are always partial.





Option #3: requires a plugin to be aware of specific repository
composition(s) and of other plugins, and creates a code dependency between plugins.
For example, the RPM plugin could delegate ISOs to the File plugin and
Kickstart Trees to the KickStart Tree plugin.

Do you mean that the RPM plug-in would directly call into the File plug-in?
If that's the case then I don't like it much, would be a pain every
time a new plug-in would be introduced (O(len(plugin)^2) of updates)
or if the API of a plug-in changed (O(len(plugin)) updates).
Esp. keeping the plugin code aware of other plugin updates would be ugly.


Agreed.  The plugins could install libs into site-packages which would 
at least mitigate the complexity of calling into each other through the 
pulp plugin framework but I don't think it helps much. Even the rpm 
dependency is undesirable.





For all options, plugins (Remotes) need to limit sync to affect only those
content types within their domain.  For example, the RPM (Remote) sync
cannot add/remove ISO or KS Trees.
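
The domain rule can be sketched in a few lines of Python. This is only an illustration under assumed names: apply_sync, the (content_type, name) tuples and the type labels are hypothetical, not part of the plugin API. The point is that a sync replaces units of the plugin's own types and carries everything else over untouched.

```python
def apply_sync(current, found, domain):
    """Return the new content set after one plugin (Remote) sync.

    `current` and `found` are sets of (content_type, name) tuples.
    `domain` is the set of content types the plugin owns. Units of any
    other type are carried over untouched, and anything the plugin
    reports outside its domain is ignored.
    """
    untouched = {unit for unit in current if unit[0] not in domain}
    synced = {unit for unit in found if unit[0] in domain}
    return untouched | synced


current = {("rpm", "bash-4.2"), ("iso", "boot.iso"), ("ks_tree", "rhel7")}
# The RPM remote now reports a newer bash and (wrongly) an ISO.
found = {("rpm", "bash-4.4"), ("iso", "other.iso")}
result = apply_sync(current, found, domain={"rpm", "drpm"})
```

Here the old bash RPM is replaced, while the ISO and the KS tree survive unchanged, even though the RPM remote reported an ISO of its own.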

I am an advocate of some form of options #1 or #2.  Combining plugins
(Remotes) as needed to deal with arbitrary combinations within remote
repositories seems very powerful; does not impose complexity on plugin
writers; and does not introduce code dependencies 

Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Jeff Ortel



On 05/15/2018 05:58 AM, Austin Macdonald wrote:

Here's another complexity, how do 2 plugins create a single publication?


The plugin API could make this seamless.

We basically have the same problem of 2 parallel operations creating 
content from a single source.


I don't think so.  Plugins should not manipulate content outside of 
their domain (other plugins' content) so either serial or parallel should 
be safe.




On Tue, May 15, 2018, 06:27 Ina Panova wrote:


+1 on not introducing dependencies between plugins.

What will be the behavior in case there is a composed repo of rpm
and ks trees but just the rpm plugin is installed?

Do we fail and say we cannot sync this repo at all or we just sync
the rpm part?


Assuming plugins do not depend on each other, I think that when each 
plugin looks at the upstream repo, they will only "see" the content of 
that type. Conceptually, we will have 2 remotes, so it will feel like 
we are syncing from 2 totally distinct repositories.


The solution I've been imagining is a lot like 2. Each plugin would 
sync to a *separate repository.* These separate repositories are then 
published creating *separate publications*. This approach allows the 
plugins to live completely in ignorance of each other.


The final step is to associate *both publications to one 
distribution*, which composes the publications as they are served.


The downside is that we have to sync and publish twice, and that the 
resulting versions and publications aren't locked together. But I 
think this is better than leaving versions and publications unfinished 
with the assumption that another plugin will finish the job. Maybe 
linking them together could be a good use of the notes field.
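
A rough Python sketch of the "two publications, one distribution" idea follows. ComposedDistribution and its resolve method are hypothetical names, and a publication is reduced to a dict of relative paths; nothing here reflects how core serves content today. It only shows that composition can happen at serve time while each plugin stays unaware of the other.

```python
class ComposedDistribution:
    """Hypothetical distribution that serves several publications at one base path."""

    def __init__(self, base_path, publications):
        self.base_path = base_path
        # Each publication is a dict mapping relative path -> artifact.
        self.publications = list(publications)

    def resolve(self, relative_path):
        """Return the artifact for a path by asking each publication in order."""
        for publication in self.publications:
            if relative_path in publication:
                return publication[relative_path]
        raise KeyError(relative_path)


rpm_publication = {"repodata/repomd.xml": "repomd-v3", "Packages/bash.rpm": "bash-blob"}
ks_publication = {".treeinfo": "treeinfo-v1", "images/boot.iso": "iso-blob"}

dist = ComposedDistribution("rhel7/os", [rpm_publication, ks_publication])
```

The sketch also makes the stated downside visible: nothing ties the two publications to each other, so keeping them in step would need something extra, such as the notes field mentioned above.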


Pulp should support repositories with composed (mixed) content for the 
same reason RH does.  The repository is a collection of content that 
users want to manage together.  Consider the promotion cases: dev, test, 
prod.





Depends how we plan this ^ i guess we'll decide which option 1 or
2 fits better.

I don't want to go wild, but what if the notion of composed repos
becomes so popular in the future that their number increases? I
think we do want to be able to sync them at least partially rather
than take an all-or-nothing approach.

#2 speaks to me more for now.





Regards,

Ina Panova
Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."

On Mon, May 14, 2018 at 9:44 PM, Jeff Ortel wrote:

Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed
of multiple content types which may span the domain of a
single plugin.  Here are a few examples.  Some Red Hat RPM
repositories are composed of: RPMs, DRPMs, , ISOs and
Kickstart Trees.  Some OSTree repositories are composed of
OSTrees & Kickstart Trees. This raises a question:

How can pulp3 best support syncing with remote repositories
that are composed of multiple (unrelated) content types in a
way that doesn't result in plugins duplicating support for
content types?

Few approaches come to mind:

1. Multiple plugins (Remotes) participate in the sync flow to
produce a new repository version.
2. Multiple plugins (Remotes) are sync'd successively each
producing a new version of a repository.  Only the last
version contains the fully sync'd composition.
3. Plugins share code.
4. Other?


Option #1: Sync would be orchestrated by core or the user so
that multiple plugins (Remotes) participate in populating a
new repository version.  For example: the RPM plugin (Remote)
and the Kickstart Tree plugin (Remote) would both be sync'd
against the same remote repository that is composed of both
types.  The new repository version would be composed of the
result of both plugin (Remote) syncs.  To support this, we'd
need to provide a way for each plugin to operate seamlessly on
the same (new) repository version.  Perhaps something internal
to the RepositoryVersion.  The repository version would not be
marked "complete" until the last plugin (Remote) sync has
succeeded.  More complicated than #2 but results in only
creating truly complete versions or nothing.  No idea how this
would work with current REST API whereby plugins provide sync
endpoints.

Option #2: Sync would be orchestrated by core or the user so
that multiple plugins (Remotes) create successive repository
versions.  For example: the RPM plugin (Remote) and the
Kickstart Tree plugin (Remote) would both be sync'd against
 

Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Jeff Ortel



On 05/15/2018 05:26 AM, Ina Panova wrote:

+1 on not introducing dependencies between plugins.

What will be the behavior in case there is a composed repo of rpm and 
ks trees but just the rpm plugin is installed?


I would expect the result would be to only sync the rpm content into the 
pulp repository.


Do we fail and say we cannot sync this repo at all or we just sync the 
rpm part?


No, I think it would be expected to succeed since the user has only 
installed the rpm plugin and requested that only rpm content be sync'd.  
The remote repository is composed of multiple content types out of 
convenience for managing the content.  Pulp should not be bound to the 
organization of remote repositories.
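
As a small illustration of that expectation, the sketch below splits a remote repository's content types into what the installed plugins can sync and what is simply skipped. plan_sync, the plugin name and the type labels are all assumptions made up for this example.

```python
def plan_sync(remote_types, installed_plugins):
    """Split a remote repository's content types into syncable and skipped.

    `installed_plugins` maps a plugin name to the content types it handles.
    """
    handled = set()
    for types in installed_plugins.values():
        handled |= set(types)
    syncable = sorted(t for t in remote_types if t in handled)
    skipped = sorted(t for t in remote_types if t not in handled)
    return syncable, skipped


syncable, skipped = plan_sync(
    remote_types={"rpm", "drpm", "ks_tree"},
    installed_plugins={"pulp_rpm": ["rpm", "drpm"]},
)
```

With only the RPM plugin installed, the KS tree ends up in the skipped list and the sync of the RPM content still succeeds.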




Depends how we plan this ^ i guess we'll decide which option 1 or 2 
fits better.


I don't want to go wild, but what if the notion of composed repos becomes 
so popular in the future that their number increases? I think we do 
want to be able to sync them at least partially rather than take an 
all-or-nothing approach.


#2 speaks to me more for now.


#2 will create repository versions with partial content which are 
complete=True.  Given users can choose which version to publish, do you 
see this as a problem?  What about cases where the "latest" version is, 
at times, partial?
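
The concern about "latest" can be shown with a short Python sketch of Option #2. successive_sync is a hypothetical helper and versions are reduced to plain sets; the only claim is that each plugin sync appends a version, so anyone reading "latest" between two syncs sees a partial composition.

```python
def successive_sync(versions, remote_syncs):
    """Option #2 sketch: every plugin sync appends its own new version.

    `versions` is a list of content sets, oldest first. Each callable in
    `remote_syncs` returns the units found by one plugin (Remote).
    """
    for sync in remote_syncs:
        latest = versions[-1] if versions else set()
        versions.append(latest | sync())
        yield versions[-1]  # what "latest" looks like after this step


versions = []
steps = list(successive_sync(versions, [
    lambda: {"rpm:bash"},
    lambda: {"ks:rhel7-tree"},
]))
```

After the first step, "latest" holds the RPM content but no KS tree, and nothing in the version itself marks it as partial.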








Regards,

Ina Panova
Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."

On Mon, May 14, 2018 at 9:44 PM, Jeff Ortel wrote:


Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed of
multiple content types which may span the domain of a single
plugin.  Here are a few examples. Some Red Hat RPM repositories
are composed of: RPMs, DRPMs, , ISOs and Kickstart Trees.  Some
OSTree repositories are composed of OSTrees & Kickstart Trees.
This raises a question:

How can pulp3 best support syncing with remote repositories that
are composed of multiple (unrelated) content types in a way that
doesn't result in plugins duplicating support for content types?

Few approaches come to mind:

1. Multiple plugins (Remotes) participate in the sync flow to
produce a new repository version.
2. Multiple plugins (Remotes) are sync'd successively each
producing a new version of a repository.  Only the last version
contains the fully sync'd composition.
3. Plugins share code.
4. Other?


Option #1: Sync would be orchestrated by core or the user so that
multiple plugins (Remotes) participate in populating a new
repository version.  For example: the RPM plugin (Remote) and the
Kickstart Tree plugin (Remote) would both be sync'd against the
same remote repository that is composed of both types.  The new
repository version would be composed of the result of both plugin
(Remote) syncs.  To support this, we'd need to provide a way for
each plugin to operate seamlessly on the same (new) repository
version.  Perhaps something internal to the RepositoryVersion. 
The repository version would not be marked "complete" until the
last plugin (Remote) sync has succeeded.  More complicated than #2
but results in only creating truly complete versions or nothing. 
No idea how this would work with current REST API whereby plugins
provide sync endpoints.

Option #2: Sync would be orchestrated by core or the user so that
multiple plugins (Remotes) create successive repository versions. 
For example: the RPM plugin (Remote) and the Kickstart Tree plugin
(Remote) would both be sync'd against the same remote repository
that is a composition including both types.  The intermediate
versions would be incomplete. Only the last version contains the
fully sync'd composition.  This approach can be supported by core
today :) but will produce incomplete repository versions that are
marked complete=True.  This /seems/ undesirable, right? This may
not be a problem for distribution since I would imagine that only
the last (fully composed) version would be published.  But what
about other usages of the repository's "latest" version?

Option #3: requires a plugin to be aware of specific repository
composition(s) and of other plugins, and creates a code dependency
between plugins.  For example, the RPM plugin could delegate ISOs
to the File plugin and Kickstart Trees to the KickStart Tree plugin.

For all options, plugins (Remotes) need to limit sync to affect
only those content types within their domain. For example, the RPM
(Remote) sync cannot add/remove ISO or KS Trees.

I am an advocate of some form of options #1 or #2. Combining
plugins (Remotes) as needed to deal with arbitrary combinations
within remote repositories seems very powerful; does not impose
complexity on plugin writers; and does 

Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Milan Kovacik
Hi,

On Tue, May 15, 2018 at 3:22 PM, Dennis Kliban  wrote:
> On Mon, May 14, 2018 at 3:44 PM, Jeff Ortel  wrote:
>>
>> Let's brainstorm on something.
>>
>> Pulp needs to deal with remote repositories that are composed of multiple
>> content types which may span the domain of a single plugin.  Here are a few
>> examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
>> ISOs and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
>> & Kickstart Trees. This raises a question:
>>
>> How can pulp3 best support syncing with remote repositories that are
>> composed of multiple (unrelated) content types in a way that doesn't result
>> in plugins duplicating support for content types?
>>
>> Few approaches come to mind:
>>
>> 1. Multiple plugins (Remotes) participate in the sync flow to produce a
>> new repository version.
>> 2. Multiple plugins (Remotes) are sync'd successively each producing a new
>> version of a repository.  Only the last version contains the fully sync'd
>> composition.
>> 3. Plugins share code.
>> 4. Other?
>>
>>
>> Option #1: Sync would be orchestrated by core or the user so that multiple
>> plugins (Remotes) participate in populating a new repository version.  For
>> example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
>> would both be sync'd against the same remote repository that is composed of
>> both types.  The new repository version would be composed of the result of
>> both plugin (Remote) syncs.  To support this, we'd need to provide a way for
>> each plugin to operate seamlessly on the same (new) repository version.
>> Perhaps something internal to the RepositoryVersion.  The repository version
>> would not be marked "complete" until the last plugin (Remote) sync has
>> succeeded.  More complicated than #2 but results in only creating truly
>> complete versions or nothing.  No idea how this would work with current REST
>> API whereby plugins provide sync endpoints.
>>
>
> I like this approach because it allows the user to perform a single call to
> the REST API and specify multiple "sync methods" to use to create a single
> new repository version.

Same here, esp. if the goal is an all-or-nothing behavior w/r/t the
mix-in remotes; i.e. an atomic sync.
This has a benefit of a clear start and end of the sync procedure,
that the user might want to refer to.

>
>>
>> Option #2: Sync would be orchestrated by core or the user so that multiple
>> plugins (Remotes) create successive repository versions.  For example: the
>> RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
>> sync'd against the same remote repository that is a composition including
>> both types.  The intermediate versions would be incomplete.  Only the last
>> version contains the fully sync'd composition.  This approach can be
>> supported by core today :) but will produce incomplete repository versions
>> that are marked complete=True.  This /seems/ undesirable, right?  This may
>> not be a problem for distribution since I would imagine that only the last
>> (fully composed) version would be published.  But what about other usages of
>> the repository's "latest" version?

I'm afraid I don't see use of a middle-version esp. in case of
failures; e.g. ostree failed to sync while rpm managed and kickstart
managed too; is the sync OK as a whole? What to do with the versions
created? Should I merge the successes into one and retry the failure?
How many versions would this introduce?

>>
>> Option #3: requires a plugin to be aware of specific repository
>> composition(s) and of other plugins, and creates a code dependency between plugins.
>> For example, the RPM plugin could delegate ISOs to the File plugin and
>> Kickstart Trees to the KickStart Tree plugin.

Do you mean that the RPM plug-in would directly call into the File plug-in?
If that's the case then I don't like it much, would be a pain every
time a new plug-in would be introduced (O(len(plugin)^2) of updates)
or if the API of a plug-in changed (O(len(plugin)) updates).
Esp. keeping the plugin code aware of other plugin updates would be ugly.
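
To put a number on the quadratic growth, here is a tiny Python sketch. It assumes the worst case, where every plugin may call directly into every other one; direct_call_edges is a made-up helper and the plugin names are just labels.

```python
from itertools import permutations


def direct_call_edges(plugins):
    """List every (caller, callee) pair if each plugin may call each other one."""
    return list(permutations(plugins, 2))


three = direct_call_edges(["rpm", "file", "kickstart"])
four = direct_call_edges(["rpm", "file", "kickstart", "ostree"])
added = len(four) - len(three)
```

Under that assumption n plugins give n * (n - 1) caller/callee pairs, so going from three plugins to four doubles the number of relationships that have to be kept in sync.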

>>
>> For all options, plugins (Remotes) need to limit sync to affect only those
>> content types within their domain.  For example, the RPM (Remote) sync
>> cannot add/remove ISO or KS Trees.
>>
>> I am an advocate of some form of options #1 or #2.  Combining plugins
>> (Remotes) as needed to deal with arbitrary combinations within remote
>> repositories seems very powerful; does not impose complexity on plugin
>> writers; and does not introduce code dependencies between plugins.
>>
>> Thoughts?
>>
>

Cheers,
milan


Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Dennis Kliban
On Mon, May 14, 2018 at 3:44 PM, Jeff Ortel  wrote:

> Let's brainstorm on something.
>
> Pulp needs to deal with remote repositories that are composed of multiple
> content types which may span the domain of a single plugin.  Here are a few
> examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
> ISOs and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
> & Kickstart Trees. This raises a question:
>
> How can pulp3 best support syncing with remote repositories that are
> composed of multiple (unrelated) content types in a way that doesn't result
> in plugins duplicating support for content types?
>
> Few approaches come to mind:
>
> 1. Multiple plugins (Remotes) participate in the sync flow to produce a
> new repository version.
> 2. Multiple plugins (Remotes) are sync'd successively each producing a new
> version of a repository.  Only the last version contains the fully sync'd
> composition.
> 3. Plugins share code.
> 4. Other?
>
>
> Option #1: Sync would be orchestrated by core or the user so that multiple
> plugins (Remotes) participate in populating a new repository version.  For
> example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
> would both be sync'd against the same remote repository that is composed of
> both types.  The new repository version would be composed of the result of
> both plugin (Remote) syncs.  To support this, we'd need to provide a way
> for each plugin to operate seamlessly on the same (new) repository
> version.  Perhaps something internal to the RepositoryVersion.  The
> repository version would not be marked "complete" until the last plugin
> (Remote) sync has succeeded.  More complicated than #2 but results in only
> creating truly complete versions or nothing.  No idea how this would work
> with current REST API whereby plugins provide sync endpoints.
>
>
I like this approach because it allows the user to perform a single call to
the REST API and specify multiple "sync methods" to use to create a single
new repository version.


> Option #2: Sync would be orchestrated by core or the user so that multiple
> plugins (Remotes) create successive repository versions.  For example: the
> RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
> sync'd against the same remote repository that is a composition including
> both types.  The intermediate versions would be incomplete.  Only the
> last version contains the fully sync'd composition.  This approach can be
> supported by core today :) but will produce incomplete repository versions
> that are marked complete=True.  This /seems/ undesirable, right?  This may
> not be a problem for distribution since I would imagine that only the last
> (fully composed) version would be published.  But what about other usages
> of the repository's "latest" version?
>
> Option #3: requires a plugin to be aware of specific repository
> composition(s) and of other plugins, and creates a code dependency between
> plugins.  For example, the RPM plugin could delegate ISOs to the File
> plugin and Kickstart Trees to the KickStart Tree plugin.
>
> For all options, plugins (Remotes) need to limit sync to affect only those
> content types within their domain.  For example, the RPM (Remote) sync
> cannot add/remove ISO or KS Trees.
>
> I am an advocate of some form of options #1 or #2.  Combining plugins
> (Remotes) as needed to deal with arbitrary combinations within remote
> repositories seems very powerful; does not impose complexity on plugin
> writers; and does not introduce code dependencies between plugins.
>
> Thoughts?
>


Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Austin Macdonald
Here's another complexity: how do 2 plugins create a single publication? We
basically have the same problem of 2 parallel operations creating content
from a single source.

On Tue, May 15, 2018, 06:27 Ina Panova  wrote:

> +1 on not introducing dependencies between plugins.
>
> What will be the behavior in case there is a composed repo of rpm and ks
> trees but just the rpm plugin is installed?
>
> Do we fail and say we cannot sync this repo at all or we just sync the rpm
> part?
>

Assuming plugins do not depend on each other, I think that when each plugin
looks at the upstream repo, they will only "see" the content of that type.
Conceptually, we will have 2 remotes, so it will feel like we are syncing
from 2 totally distinct repositories.

The solution I've been imagining is a lot like option #2. Each plugin would sync to
a *separate repository.* These separate repositories are then published
creating *separate publications*. This approach allows the plugins to live
completely in ignorance of each other.

The final step is to associate *both publications to one distribution*,
which composes the publications as they are served.
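
To make the "one distribution, two publications" idea concrete, here is a minimal sketch. It is not Pulp's API: the Publication and Distribution classes, their fields, and the conflict rule are all hypothetical stand-ins. It only shows that composition can happen at serve time while each plugin's publication stays unaware of the other.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Publication:
    """Hypothetical stand-in: maps served relative paths to stored artifacts."""
    plugin: str
    paths: Dict[str, str] = field(default_factory=dict)


@dataclass
class Distribution:
    """Hypothetical stand-in: serves several publications under one base path."""
    base_path: str
    publications: List[Publication] = field(default_factory=list)

    def add(self, publication: Publication) -> None:
        # Assumption: overlapping paths are rejected so that two plugins
        # can never shadow each other's content.
        for existing in self.publications:
            clash = existing.paths.keys() & publication.paths.keys()
            if clash:
                raise ValueError(
                    f"path conflict with {existing.plugin}: {sorted(clash)}"
                )
        self.publications.append(publication)

    def resolve(self, relative_path: str) -> Optional[str]:
        # Composition happens here, at serve time: ask each publication in turn.
        for publication in self.publications:
            if relative_path in publication.paths:
                return publication.paths[relative_path]
        return None


# Two publications produced independently by two plugins.
rpm_pub = Publication("rpm", {"Packages/foo-1.0.rpm": "artifact-1"})
ks_pub = Publication("kickstart", {"images/boot.iso": "artifact-2"})

dist = Distribution("rhel7")
dist.add(rpm_pub)
dist.add(ks_pub)
```

Nothing in this sketch locks the two publications together, which is the downside described next.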

The downside is that we have to sync and publish twice, and that the
resulting versions and publications aren't locked together. But I think
this is better than leaving versions and publications unfinished with the
assumption that another plugin will finish the job. Maybe linking them
together could be a good use of the notes field.


> Depending on how we plan this ^, I guess we'll decide whether option 1 or 2
> fits better.
>
> Don't want to go wild, but what if the notion of composed repos becomes so
> popular in the future that their number increases? I think we want to be
> able to sync them at least partially rather than take an all-or-nothing
> approach.
>
> #2 speaks to me more for now.
>
>
>
>
> 
> Regards,
>
> Ina Panova
> Software Engineer| Pulp| Red Hat Inc.
>
> "Do not go where the path may lead,
>  go instead where there is no path and leave a trail."
>
> On Mon, May 14, 2018 at 9:44 PM, Jeff Ortel  wrote:
>
>> Let's brainstorm on something.
>>
>> Pulp needs to deal with remote repositories that are composed of multiple
>> content types which may span the domain of a single plugin.  Here are a few
>> examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
>> ISOs and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
>> & Kickstart Trees. This raises a question:
>>
>> How can pulp3 best support syncing with remote repositories that are
>> composed of multiple (unrelated) content types in a way that doesn't result
>> in plugins duplicating support for content types?
>>
>> A few approaches come to mind:
>>
>> 1. Multiple plugins (Remotes) participate in the sync flow to produce a
>> new repository version.
>> 2. Multiple plugins (Remotes) are sync'd successively each producing a
>> new version of a repository.  Only the last version contains the fully
>> sync'd composition.
>> 3. Plugins share code.
>> 4. Other?
>>
>>
>> Option #1: Sync would be orchestrated by core or the user so that
>> multiple plugins (Remotes) participate in populating a new repository
>> version.  For example: the RPM plugin (Remote) and the Kickstart Tree
>> plugin (Remote) would both be sync'd against the same remote repository
>> that is composed of both types.  The new repository version would be
>> composed of the result of both plugin (Remote) syncs.  To support this,
>> we'd need to provide a way for each plugin to operate seamlessly on the
>> same (new) repository version.  Perhaps something internal to the
>> RepositoryVersion.  The repository version would not be marked "complete"
>> until the last plugin (Remote) sync has succeeded.  More complicated than
>> #2 but results in only creating truly complete versions or nothing.  No
>> idea how this would work with current REST API whereby plugins provide sync
>> endpoints.
>>
>> Option #2: Sync would be orchestrated by core or the user so that
>> multiple plugins (Remotes) create successive repository versions.  For
>> example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
>> would both be sync'd against the same remote repository that is a
>> composition including both types.  The intermediate versions would be
>> incomplete.  Only the last version contains the fully sync'd
>> composition.  This approach can be supported by core today :) but will
>> produce incomplete repository versions that are marked complete=True.  This
>> /seems/ undesirable, right?  This may not be a problem for distribution
>> since I would imagine that only the last (fully composed) version would be
>> published.  But what about other usages of the repository's "latest"
>> version?
>>
>> Option #3: requires a plugin to be aware of specific repository
>> composition(s) and of other plugins, and creates a code dependency between
>> plugins.  For example, the RPM plugin could delegate ISOs to 

Re: [Pulp-dev] Composed Repositories

2018-05-15 Thread Ina Panova
+1 on not introducing dependencies between plugins.

What will be the behavior in case there is a composed repo of rpm and ks
trees but just the rpm plugin is installed?
Do we fail and say we cannot sync this repo at all or we just sync the rpm
part?

Depending on how we plan this ^, I guess we'll decide whether option 1 or 2
fits better.

Don't want to go wild, but what if the notion of composed repos becomes so
popular in the future that their number increases? I think we want to be
able to sync them at least partially rather than take an all-or-nothing
approach.

#2 speaks to me more for now.





Regards,

Ina Panova
Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."

On Mon, May 14, 2018 at 9:44 PM, Jeff Ortel  wrote:

> Let's brainstorm on something.
>
> Pulp needs to deal with remote repositories that are composed of multiple
> content types which may span the domain of a single plugin.  Here are a few
> examples.  Some Red Hat RPM repositories are composed of: RPMs, DRPMs, ,
> ISOs and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
> & Kickstart Trees. This raises a question:
>
> How can pulp3 best support syncing with remote repositories that are
> composed of multiple (unrelated) content types in a way that doesn't result
> in plugins duplicating support for content types?
>
> A few approaches come to mind:
>
> 1. Multiple plugins (Remotes) participate in the sync flow to produce a
> new repository version.
> 2. Multiple plugins (Remotes) are sync'd successively each producing a new
> version of a repository.  Only the last version contains the fully sync'd
> composition.
> 3. Plugins share code.
> 4. Other?
>
>
> Option #1: Sync would be orchestrated by core or the user so that multiple
> plugins (Remotes) participate in populating a new repository version.  For
> example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
> would both be sync'd against the same remote repository that is composed of
> both types.  The new repository version would be composed of the result of
> both plugin (Remote) syncs.  To support this, we'd need to provide a way
> for each plugin to operate seamlessly on the same (new) repository
> version.  Perhaps something internal to the RepositoryVersion.  The
> repository version would not be marked "complete" until the last plugin
> (Remote) sync has succeeded.  More complicated than #2 but results in only
> creating truly complete versions or nothing.  No idea how this would work
> with current REST API whereby plugins provide sync endpoints.
>
> Option #2: Sync would be orchestrated by core or the user so that multiple
> plugins (Remotes) create successive repository versions.  For example: the
> RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
> sync'd against the same remote repository that is a composition including
> both types.  The intermediate versions would be incomplete.  Only the
> last version contains the fully sync'd composition.  This approach can be
> supported by core today :) but will produce incomplete repository versions
> that are marked complete=True.  This /seems/ undesirable, right?  This may
> not be a problem for distribution since I would imagine that only the last
> (fully composed) version would be published.  But what about other usages
> of the repository's "latest" version?
>
> Option #3: requires a plugin to be aware of specific repository
> composition(s) and of other plugins, and creates a code dependency between
> plugins.  For example, the RPM plugin could delegate ISOs to the File
> plugin and Kickstart Trees to the KickStart Tree plugin.
>
> For all options, plugins (Remotes) need to limit sync to affect only those
> content types within their domain.  For example, the RPM (Remote) sync
> cannot add/remove ISO or KS Trees.
>
> I am an advocate of some form of options #1 or #2.  Combining plugins
> (Remotes) as needed to deal with arbitrary combinations within remote
> repositories seems very powerful; does not impose complexity on plugin
> writers; and does not introduce code dependencies between plugins.
>
> Thoughts?
>


[Pulp-dev] Composed Repositories

2018-05-14 Thread Jeff Ortel

Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed of 
multiple content types which may span the domain of a single plugin.  
Here are a few examples.  Some Red Hat RPM repositories are composed of: 
RPMs, DRPMs, , ISOs and Kickstart Trees.  Some OSTree repositories are 
composed of OSTrees & Kickstart Trees. This raises a question:


How can pulp3 best support syncing with remote repositories that are 
composed of multiple (unrelated) content types in a way that doesn't 
result in plugins duplicating support for content types?


A few approaches come to mind:

1. Multiple plugins (Remotes) participate in the sync flow to produce a 
new repository version.
2. Multiple plugins (Remotes) are sync'd successively each producing a 
new version of a repository.  Only the last version contains the fully 
sync'd composition.

3. Plugins share code.
4. Other?


Option #1: Sync would be orchestrated by core or the user so that 
multiple plugins (Remotes) participate in populating a new repository 
version.  For example: the RPM plugin (Remote) and the Kickstart Tree 
plugin (Remote) would both be sync'd against the same remote repository 
that is composed of both types.  The new repository version would be 
composed of the result of both plugin (Remote) syncs.  To support this, 
we'd need to provide a way for each plugin to operate seamlessly on the 
same (new) repository version.  Perhaps something internal to the 
RepositoryVersion. The repository version would not be marked "complete" 
until the last plugin (Remote) sync has succeeded.  More complicated 
than #2 but results in only creating truly complete versions or nothing. 
No idea how this would work with current REST API whereby plugins 
provide sync endpoints.
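
A minimal sketch of the option #1 flow, under stated assumptions: RepositoryVersion here is a hypothetical stand-in rather than Pulp's model, and each plugin (Remote) sync is reduced to a plain callable that returns the content units it found. The only point illustrated is that the version is marked complete after the last sync succeeds, and that a failure in any sync leaves the caller with nothing.

```python
from typing import Callable, Iterable, List, Set

# Assumption: a "sync" is any callable returning the content units it found.
SyncFn = Callable[[], Iterable[str]]


class RepositoryVersion:
    """Hypothetical stand-in for a version shared by several plugin syncs."""

    def __init__(self, number: int) -> None:
        self.number = number
        self.content: Set[str] = set()
        self.complete = False


def composed_sync(number: int, syncs: List[SyncFn]) -> RepositoryVersion:
    """Run every sync against one new version; all succeed or nothing is kept."""
    version = RepositoryVersion(number)
    for sync in syncs:
        # An exception from any sync propagates out of this function, so the
        # partially populated version is never returned to the caller.
        version.content.update(sync())
    version.complete = True  # only after the last sync has succeeded
    return version


def rpm_sync() -> List[str]:
    return ["foo-1.0.rpm", "bar-2.0.rpm"]


def kickstart_sync() -> List[str]:
    return ["ks-tree-7.5"]


version = composed_sync(1, [rpm_sync, kickstart_sync])
```

How such an orchestrating call would map onto per-plugin sync endpoints is left open here, as it is in the text above.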


Option #2: Sync would be orchestrated by core or the user so that 
multiple plugins (Remotes) create successive repository versions.  For 
example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote) 
would both be sync'd against the same remote repository that is a 
composition including both types.  The intermediate versions would be 
incomplete. Only the last version contains the fully sync'd 
composition.  This approach can be supported by core today :) but will 
produce incomplete repository versions that are marked complete=True.  
This /seems/ undesirable, right?  This may not be a problem for 
distribution since I would imagine that only the last (fully composed) 
version would be published.  But what about other usages of the 
repository's "latest" version?
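
The concern with option #2 can be shown with a small sketch. The RepositoryVersion tuple and the successive_sync helper are hypothetical, and each plugin (Remote) sync is again just a callable returning content units. Every sync produces its own version flagged complete=True, even though only the last one holds the full composition.

```python
from typing import Callable, FrozenSet, Iterable, List, NamedTuple


class RepositoryVersion(NamedTuple):
    """Hypothetical stand-in for an immutable repository version."""
    number: int
    content: FrozenSet[str]
    complete: bool


def successive_sync(
    base: RepositoryVersion,
    syncs: List[Callable[[], Iterable[str]]],
) -> List[RepositoryVersion]:
    """Each sync builds a new version on top of the previous one."""
    versions = []
    latest = base
    for sync in syncs:
        latest = RepositoryVersion(
            number=latest.number + 1,
            content=latest.content | frozenset(sync()),
            complete=True,  # flagged complete even if other syncs are pending
        )
        versions.append(latest)
    return versions


empty = RepositoryVersion(number=0, content=frozenset(), complete=True)
versions = successive_sync(
    empty,
    [lambda: ["foo-1.0.rpm"], lambda: ["ks-tree-7.5"]],
)
```

Anything that reads the "latest" version between the two syncs would see versions[0], which claims to be complete but has no Kickstart Tree in it.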


Option #3: requires a plugin to be aware of specific repository 
composition(s) and of other plugins, and creates a code dependency between 
plugins.  For example, the RPM plugin could delegate ISOs to the File 
plugin and Kickstart Trees to the KickStart Tree plugin.


For all options, plugins (Remotes) need to limit sync to affect only 
those content types within their domain.  For example, the RPM (Remote) 
sync cannot add/remove ISO or KS Trees.
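
One way to picture that restriction, as a sketch only: content is modeled as a mapping of unit name to content type, and sync_within_domain is a hypothetical helper, not an existing Pulp function. The sync mirrors the remote for the types in the plugin's domain and leaves every other type in the repository untouched.

```python
from typing import Dict, Set


def sync_within_domain(
    current: Dict[str, str],
    remote: Dict[str, str],
    domain: Set[str],
) -> Dict[str, str]:
    """Return new repository content; both mappings are unit name -> type."""
    # Keep everything outside the plugin's domain exactly as it is.
    result = {
        name: ctype for name, ctype in current.items() if ctype not in domain
    }
    # Mirror the remote, but only for content types inside the domain.
    result.update(
        {name: ctype for name, ctype in remote.items() if ctype in domain}
    )
    return result


current = {"old-1.0.rpm": "rpm", "boot.iso": "iso"}
remote = {"new-2.0.rpm": "rpm", "other.iso": "iso", "ks-tree-7.5": "kickstart"}

after_rpm_sync = sync_within_domain(current, remote, domain={"rpm", "drpm"})
```

In this example the RPM sync replaces old-1.0.rpm with new-2.0.rpm, keeps boot.iso, and neither adds other.iso nor the Kickstart Tree.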


I am an advocate of some form of options #1 or #2.  Combining plugins 
(Remotes) as needed to deal with arbitrary combinations within remote 
repositories seems very powerful; does not impose complexity on plugin 
writers; and does not introduce code dependencies between plugins.


Thoughts?