On 2/22/19 12:07 PM, Brian Bouterse wrote:


On Fri, Feb 22, 2019 at 9:36 AM Justin Sherrill <jsher...@redhat.com> wrote:


    On 2/18/19 2:41 PM, Austin Macdonald wrote:
    Originally, our upload story was as follows:
    The user will upload a new file to Pulp via POST to /artifacts/
    (provided by core)
    The user will create a new plugin specific Content via POST to
    /path/to/plugin/content/, referencing whichever artifacts it
    contains and supplying whatever fields the new content expects.
    The user will add the new content to a repository via POST to
    /repositories/1/versions/

    However, this is somewhat cumbersome to the user with 3 API calls
    to accomplish something that only took one call in Pulp 2.

    How would you do this with one call in pulp2?
    https://docs.pulpproject.org/dev-guide/integration/rest-api/content/upload.html
    seems to suggest 3-4 calls.

Some plugins implemented the pulp2 equivalent of a one-shot uploader. Those docs cover pulp2's core and don't include the plugins' docs.


    There are a couple of different paths plugins have taken to
    improve the user experience:
    The Python plugin follows the above workflow, but reads the
    Artifact file to determine the values for the fields. The RPM
    plugin has gone even further and created a new endpoint for "one
    shot" upload that performs all of this in a single call. I think
    it is likely that the Python plugin will move more in the "one
    shot" direction, and other plugins will probably follow.

    How does the RPM one shot api work?  Will it be compatible with
    whatever solution https://pulp.plan.io/issues/4196 arrives at?

You would upload the Artifact as binary data along with what content type it is and what relative path it uses, and Pulp creates the Artifact, Content unit, and ContentArtifact. It should be compatible with issue 4196 because Django's binary form data should allow for parallel uploading before calling the view handler. It may take 2 calls though. To me the issue isn't so much the number of calls as the complexity of the client data payload.

If I'm having to chunk up data, I already have quite a bit of client data payload complexity.  In pulp 2 this was most of the complexity!

    I would hate for all our plugins to move to One shot methods which
    users can't even rely on.

I don't think we're taking the "generic" uploading away. You can always rely on that. The issue with one-shot is that it's not possible (literally) for many content types, e.g. Artifact-less content. It's also hard for multi-artifact Content, so that would probably still be something plugin writers would provide as a custom thing for their content type. Regardless, it's just not possible to have consistency in this area.

Why is it not possible to create a one-shot upload for artifact-less content?  (Maybe we're defining what a one-shot upload actually is differently; I'm reading it as something that combines multiple steps into one.)

Why is consistency not possible? I guess I don't see a huge variation of upload scenarios beyond:

1.  Upload zero to many files as artifacts

2.  Provide some metadata about the zero or more artifacts or let the plugin parse it out themselves (or maybe even a combination of the two)

3.  Import that unit into a repository.
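
As a rough illustration of the three steps above, here is a minimal sketch that only plans the calls and sends nothing. The paths are the ones quoted earlier in this thread; the payload field names ("file", "artifacts", "add_content_units") and the helper names are hypothetical, chosen for illustration, and are not taken from the Pulp API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PlannedRequest:
    """One HTTP call a client would make; nothing is sent by this sketch."""
    method: str
    path: str
    payload: Dict[str, Any] = field(default_factory=dict)


def plan_upload(filenames: List[str],
                metadata: Dict[str, Any],
                repository_path: str = "/repositories/1/") -> List[PlannedRequest]:
    """Return the ordered calls: upload artifacts, create content, add to repo."""
    calls: List[PlannedRequest] = []

    # Step 1: one POST per file. Artifact-less content has zero files,
    # so this loop simply adds nothing.
    for name in filenames:
        calls.append(PlannedRequest("POST", "/artifacts/", {"file": name}))

    # Step 2: create the plugin-specific content. The caller may pass
    # metadata, or leave it empty if the plugin parses the artifact itself.
    content_payload = dict(metadata)
    content_payload["artifacts"] = list(filenames)  # hypothetical field name
    calls.append(PlannedRequest("POST", "/path/to/plugin/content/", content_payload))

    # Step 3: create a new repository version containing the new content.
    calls.append(PlannedRequest(
        "POST",
        repository_path + "versions/",
        {"add_content_units": ["<content href returned by step 2>"]},  # hypothetical
    ))
    return calls
```

With zero files the plan has two calls, and with two files it has four, which is one way the same three steps can cover both artifact-less and multi-artifact content.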

I can see it being difficult as a user to go through all of those steps (even if 2 & 3 were combined into one), and the desire is to simplify the process, but uploading arbitrary files is not simple.   Why do I need to give up the plugin's ability to parse the unit's details because I'm using the consistent api?

Keep in mind all my questions are coming from a very ignorant perspective with respect to pulp3 internals, and more from a user perspective.

    My problem with single api calls to upload files is that we cannot
    reliably use them due to limitations in request sizes.  We have to
    be prepared to use multiple calls to upload files regardless. 
    Maybe if a user is using some plugin that never has super large
    files (ansible?) you could be confident you would never hit a
    request size limitation.   But file, docker, and yum all would
    require multiple calls to get the physical data to the server.

I believe arbitrarily large files can be uploaded either through multi-part form data or through the django-chunked interface. We'll see what happens with 4196, but I expect arbitrary payload size to be a requirement for Pulp users.
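
To make the chunking concern concrete, here is a minimal sketch of how a client might split a large file into byte ranges that each fit under a request size limit. It assumes each chunk is labelled with a standard HTTP Content-Range value; whether the interface chosen for 4196 uses that header is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple


def chunk_ranges(total_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split total_size bytes into inclusive (start, end) byte ranges."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def content_range_header(start: int, end: int, total_size: int) -> str:
    """Format a Content-Range value for one chunk (assumed to be what the server expects)."""
    return f"bytes {start}-{end}/{total_size}"
```

Each range would become one request, so the number of calls grows with file size no matter how the content-creation step is designed.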

    I care more about having a consistent method for uploading files
    than having fewer api calls.   If we need some content-specific
    api, that's fine, but please make it a consistent part of the
    process.

It sounds like the 4-call interface is the only choice then if consistency is a must. There isn't a way to offer consistency for one-shot uploaders. Is it ok that Katello will have to fill out all of the field data when you post the content type? What could be better?

I'll reserve my comments here based on the discussion above.

Thanks!

Justin


    I feel like we may be chasing the wrong goal here (fewer calls vs
    a more consistent experience).


    That said, I think we should discuss this as a community to
    encourage plugins to behave similarly, and because there may also
    be a possibility for sharing some code. It is my hope that a
    "one shot upload" could do 2 things: 1) Upload and create
    Content. 2) Optionally add that content to repositories.

    _______________________________________________
    Pulp-dev mailing list
    Pulp-dev@redhat.com
    https://www.redhat.com/mailman/listinfo/pulp-dev
