On 2/14/20 1:09 PM, David Davis wrote:
Grant and I met today to discuss importers and exporters[0] and we'd
like some feedback before we proceed with the design. To sum up this
feature briefly: users can export a repository version from one Pulp
instance and import it to another.
# Master/Detail vs Core
So one fundamental question is whether we should use a Master/Detail
approach or just have core control the flow but call out to plugins to
get export formats.
To give some background: we currently define Exporters (i.e.,
FileSystemExporter) in core as Master models. Plugins extend this
model, which allows them to configure or customize the Exporter. This
was necessary because some plugins need to export Publications (along
with repository metadata), while other plugins that don't have
Publications or metadata export RepositoryVersions.
The other option is to have core handle the workflow. The user would
call a core endpoint and provide a RepositoryVersion. This would work
because importing/exporting never needs Publications: the metadata
isn't used when importing back into Pulp. If needed, core could
provide a way for plugin writers to write custom handlers/exporters
for content types.
If we go with the second option, the question then becomes whether we
should divorce the concept of Exporters and import/export. Or do we
also switch Exporters from Master/Detail to core only?
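To make the second option more concrete, here is a rough sketch of core
driving the export and calling out to plugin-provided handlers. It is
plain Python rather than real Pulp code, and every name in it
(EXPORT_HANDLERS, register_export_handler, the "type" strings, the dict
layout of a unit) is hypothetical and only meant to illustrate the shape
of the idea.

```python
import json
from typing import Callable, Dict, List

# Hypothetical registry owned by core: content type name -> handler function.
EXPORT_HANDLERS: Dict[str, Callable[[dict], dict]] = {}


def register_export_handler(content_type: str):
    """Decorator a plugin could use to register a custom handler (assumed API)."""
    def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        EXPORT_HANDLERS[content_type] = func
        return func
    return decorator


def default_handler(unit: dict) -> dict:
    """Fallback used by core when a plugin registers nothing for a type."""
    return dict(unit)


@register_export_handler("rpm.updaterecord")
def export_update_record(unit: dict) -> dict:
    """Example plugin handler: normalize the related collections before export."""
    data = dict(unit)
    data["collections"] = sorted(data.get("collections", []))
    return data


def export_repository_version(units: List[dict]) -> str:
    """Core-controlled workflow: walk the units and delegate per content type."""
    exported = []
    for unit in units:
        handler = EXPORT_HANDLERS.get(unit["type"], default_handler)
        exported.append(handler(unit))
    return json.dumps(exported, sort_keys=True)
```

With this shape, a plugin that has nothing special to do registers
nothing and core's default handler is used.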
# Foreign Keys
Content can be distributed across multiple tables (e.g., UpdateRecord has
UpdateCollection, etc). In our export, we could either use primary
keys (UUIDs) or natural keys to relate records. The former assumes
that UUIDs are unique across Pulp instances. The safer but more
complex alternative is to use natural keys. This would involve storing
a set of fields on a record that would be used to identify a related
record.
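As an illustration of the natural key approach, the sketch below exports
a child record with its parent's natural key instead of the parent's
UUID, then resolves that key to the local primary key on import. The
record layout, the "parent_pk" and "parent_natural_key" names, and the
choice of "id" as the natural key field are assumptions made for the
example, not the actual Pulp models.

```python
from typing import Dict, List, Tuple


def natural_key(record: dict, fields: List[str]) -> Tuple:
    """Build a natural key from the given fields of a record."""
    return tuple(record[f] for f in fields)


def export_child(child: dict, parents_by_pk: Dict[str, dict],
                 parent_key_fields: List[str]) -> dict:
    """Replace the parent's primary key with the parent's natural key."""
    parent = parents_by_pk[child["parent_pk"]]
    exported = {k: v for k, v in child.items() if k != "parent_pk"}
    exported["parent_natural_key"] = list(natural_key(parent, parent_key_fields))
    return exported


def import_child(exported: dict,
                 parents_by_natural_key: Dict[Tuple, dict]) -> dict:
    """Look up the local parent by natural key and restore the foreign key."""
    parent = parents_by_natural_key[tuple(exported["parent_natural_key"])]
    record = {k: v for k, v in exported.items() if k != "parent_natural_key"}
    record["parent_pk"] = parent["pk"]
    return record
```

The import side never sees the source instance's UUIDs, so it does not
matter whether they collide with or differ from the local ones.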
# Incremental Exports
There are two big pieces of data contained in an export: the dataset
of Content from the database and the artifact files. An incremental
export cuts down on the size of an export by only exporting the
differences. However, when performing an incremental export, we could
still export the complete dataset instead of just a set of differences
(additions/removals/updates). This approach would be simpler and it
would allow us to ensure that the new repo version matches the
exported repo version exactly. It would, however, increase the export
size, though not by much I think: probably some number of megabytes at most.
If it's simpler, I would go with that. Saving even ~100-200 MB isn't that
big of a deal IMO. The biggest savings is in the RPM content.
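A minimal sketch of that simpler approach: the export always carries the
complete content dataset, and only the artifact files are incremental.
The function names and the "content id -> artifact digest" mapping are
hypothetical stand-ins for whatever the real export format ends up
being.

```python
from typing import Dict, Set


def build_incremental_export(previous: Dict[str, str],
                             current: Dict[str, str]) -> dict:
    """Full dataset, incremental artifacts.

    previous/current map a content id to its artifact digest for the last
    exported repo version and the one being exported now.
    """
    already_shipped = set(previous.values())
    new_artifacts = sorted(set(current.values()) - already_shipped)
    return {"content": dict(current), "artifacts": new_artifacts}


def apply_export(export: dict, local_artifacts: Set[str]) -> Dict[str, str]:
    """Build the new repo version from the full dataset in the export.

    Fails if an artifact is neither present locally nor in the export.
    """
    available = local_artifacts | set(export["artifacts"])
    missing = sorted(set(export["content"].values()) - available)
    if missing:
        raise ValueError(f"missing artifacts: {missing}")
    # The imported version is exactly the exported dataset, with no diffing.
    return dict(export["content"])
```

Because the importer rebuilds the version from the full dataset, removals
need no special handling: anything absent from "content" is simply not in
the new repo version.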
[0] https://pulp.plan.io/issues/6134
David
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev