On 06/27/2017 08:36 AM, Brian Bouterse wrote:
> I thought that we pulled out the chunking uploads from the MVP. IIRC, @jortel
> and I thought that since that use case was for high-performing (parallel)
> uploads, it should be on the 3.1+ page.
>
> +1 to just sending data without having a file handle. If the entire file is
> delivered in one request, then having a file ID to upload to in a second
> request is just cumbersome.
> +1 to having the handler receiving that file just make it an Artifact() right
> away. This will work better with how Django handles file uploads.
A few things about uploading directly to an Artifact:

- The artifact FK to a content unit would need to become optional.
- We need to add use cases for cleaning up artifacts not associated with a
  content unit.
- The upload API would need the additional information required to create an
  artifact, such as relative path, size, checksums, etc.
- Since (I assume) you are proposing uploading/writing directly to artifact
  storage (not staging in a working dir), the flow would need to involve
  (optional) validation. If validation fails, the artifact must not be
  inserted into the DB.

-jeff

>
> I also think we can skip making one Artifact from another. I don't think that
> is going to be a commonly used use case. So, with that use case and chunking
> removed, the use cases would be:
>
> * As an authenticated user, I can upload a file which becomes an Artifact.
>   At the end of the upload, the server returns the JSON representation of
>   the created Artifact.
> * As an authenticated user, I can create a content unit by providing the
>   content type, its Artifacts using IDs for each Artifact, and the metadata
>   supplied in the POST body. This call is atomic: the content unit is
>   created in the database and on the filesystem, or not at all.
>
> The biggest reason, I think, to make this adjustment is that it aligns with
> users' desire to have uploads take fewer calls. This removes at least two
> calls from the workflow. It also avoids having to save the data multiple
> times, which I don't think we can do practically.
>
> Thoughts or ideas?
>
> -Brian
>
> On Tue, Jun 27, 2017 at 8:55 AM, Dennis Kliban <dkli...@redhat.com> wrote:
>
> > My motivations for writing this email include the recent discussion about
> > the Pulp 2 upload API in #pulp and Django's documentation on file uploads.
> >
> > Files uploaded to Django are initially stored in memory (if under 2.5 MB);
> > otherwise, Python's tempfile module is used to write them to the /tmp/
> > directory.
> > The file created in /tmp is deleted when and if the last file handle is
> > closed.
> >
> > If we implement the upload API as described in the MVP doc [0], then
> > according to the Django docs [1] we will be performing a write to disk 2 or
> > 3 times for each upload. When a file is bigger than 2.5 MB, it will first
> > be written to /tmp. The same file will then be written to
> > /var/lib/pulp/uploads (or a similar location) when the FileUpload model is
> > saved. A third write will occur when an artifact is created using the
> > FileUpload. This third write will likely be a move, though.
> >
> > I propose that we eliminate writing the uploaded file to
> > /var/lib/pulp/uploads and go directly to creating an artifact. The use
> > cases can then be rewritten as follows:
> >
> > * As an authenticated user, I can upload a file with an optional chunk
> >   size and an optional offset. At the end of the upload, the server
> >   returns the JSON representation of the artifact.
> >
> > * As an authenticated user, I can create a new artifact by specifying an
> >   existing artifact ID.
> >
> > * As an authenticated user, I can create a content unit by providing the
> >   content type, its Artifacts using IDs for each Artifact, and the
> >   metadata supplied in the POST body. This call is atomic: the content
> >   unit is created in the database and on the filesystem, or not at all.
> >
> > [0] https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product#Upload-amp-Copy
> > [1] https://docs.djangoproject.com/en/1.9/topics/http/file-uploads/#handling-uploaded-files-with-a-model
> >
> > _______________________________________________
> > Pulp-dev mailing list
> > Pulp-dev@redhat.com
> > https://www.redhat.com/mailman/listinfo/pulp-dev
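
To make the validation point concrete, here is a rough sketch of the flow I
have in mind: the upload is written to disk once, its size and checksum are
computed while it streams, and the artifact record is created only if both
match what the client declared. Everything named here is hypothetical and not
Pulp code: store_artifact, ValidationError, and the plain dict standing in for
the Artifact table. It uses only the standard library and assumes SHA-256 as
the checksum. Note that it stages the bytes in a temporary file inside the
artifact storage directory, which is a small departure from writing straight
to the final path.

```python
import hashlib
import io
import os
import tempfile


class ValidationError(Exception):
    """Uploaded bytes do not match the declared size or checksum."""


def store_artifact(stream, storage_dir, relative_path, expected_size,
                   expected_sha256, db, chunk_size=64 * 1024):
    """Write an upload once and register it only if it validates.

    Hypothetical helper: `db` is a dict standing in for the Artifact table.
    """
    final_path = os.path.join(storage_dir, relative_path)
    target_dir = os.path.dirname(final_path)
    os.makedirs(target_dir, exist_ok=True)

    digest = hashlib.sha256()
    size = 0
    # Stage next to the final location so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as out:
            # Hash and count the bytes in the same pass that writes them.
            for chunk in iter(lambda: stream.read(chunk_size), b""):
                digest.update(chunk)
                size += len(chunk)
                out.write(chunk)
        if size != expected_size:
            raise ValidationError("size mismatch")
        if digest.hexdigest() != expected_sha256:
            raise ValidationError("checksum mismatch")
        # A rename, not a second copy of the data.
        os.replace(tmp_path, final_path)
    except BaseException:
        # On any failure, leave nothing behind on disk or in the DB.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

    # The FK to a content unit is optional, so it starts out empty.
    record = {"size": size, "sha256": digest.hexdigest(), "content_unit": None}
    db[relative_path] = record
    return record
```

Calling it with an io.BytesIO payload plus the matching size and digest adds
one record with content_unit set to None. A wrong digest raises
ValidationError and leaves both the dict and the storage directory unchanged.
Records whose content_unit stays None are the ones the cleanup use case would
need to find.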