This is an automated email from the ASF dual-hosted git repository.

acosentino pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/camel-website.git


The following commit(s) were added to refs/heads/main by this push:
     new f5962ea  Added S3 Streaming upload post
f5962ea is described below

commit f5962eac98f029cd2d7d66a81f7055efe5a21773
Author: Andrea Cosentino <[email protected]>
AuthorDate: Tue Apr 20 11:49:54 2021 +0200

    Added S3 Streaming upload post
---
 .../s3-streaming-upload-3.10.0/camel-featured.jpeg | Bin 0 -> 625206 bytes
 .../2021/04/s3-streaming-upload-3.10.0/index.md    |  84 +++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git 
a/content/blog/2021/04/s3-streaming-upload-3.10.0/camel-featured.jpeg 
b/content/blog/2021/04/s3-streaming-upload-3.10.0/camel-featured.jpeg
new file mode 100644
index 0000000..36c23a0
Binary files /dev/null and 
b/content/blog/2021/04/s3-streaming-upload-3.10.0/camel-featured.jpeg differ
diff --git a/content/blog/2021/04/s3-streaming-upload-3.10.0/index.md 
b/content/blog/2021/04/s3-streaming-upload-3.10.0/index.md
new file mode 100644
index 0000000..4137952
--- /dev/null
+++ b/content/blog/2021/04/s3-streaming-upload-3.10.0/index.md
@@ -0,0 +1,84 @@
+---
+title: "Camel-AWS-S3 - New Streaming upload feature"
+date: 2021-04-20
+authors: ["oscerd"]
+categories: ["Features", "Camel"]
+preview: "AWS S3 Streaming upload"
+summary: "The S3 Streaming upload feature will arrive on Camel 3.10.0"
+---
+
+Over the last few weeks I have been focused on a particular feature for the Camel AWS S3 component: streaming upload.
+
+In this post I'm going to summarize what it is and how to use it.
+
+## Streaming upload
+
+The AWS S3 component already had a multipart upload feature in its producer operations: the main "problem" with it was the need to know the size of the upload ahead of time.
+
+The streaming upload feature coming in Camel 3.10.0 removes that requirement: it doesn't need to know the size before starting the upload.
+
+### How it works
+
+This feature has been implemented on the producer side of the S3 component.
+
+The idea is to continuously send data to the producer and batch the messages. On the endpoint you have three possible ways of completing a batch:
+- timeout
+- buffer size
+- batch size
+
+Buffer size and batch size work together: a batch is completed when the batch message count is reached or when the configured buffer size has been exceeded.
+
+With the timeout in the picture, the batching is also stopped and the upload completed when the timeout is reached.
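+
+To give an idea of the configuration, here is a minimal Java DSL sketch of a streaming upload endpoint combining the three options (option names follow the camel-aws2-s3 component documentation; the bucket, key name and values are just placeholders):
+
+```java
+import org.apache.camel.builder.RouteBuilder;
+
+public class S3StreamingUploadRoute extends RouteBuilder {
+    @Override
+    public void configure() {
+        from("direct:streamData")
+            .to("aws2-s3://mycamel-bucket?keyName=file_upload_part.txt"
+                + "&streamingUploadMode=true"
+                // complete the batch after 25 messages...
+                + "&batchMessageNumber=25"
+                // ...or once the buffered data exceeds 1 MB...
+                + "&batchSize=1000000"
+                // ...or after 10 seconds, whichever comes first
+                + "&streamingUploadTimeout=10000");
+    }
+}
+```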
+
+### S3 Files naming
+
+In the streaming upload producer two different naming strategies are provided:
+- progressive
+- random
+
+The progressive strategy adds a progressive numeric suffix to each uploaded part, while the random one adds a random id as the key name suffix.
+
+If the S3 key name you specify on your endpoint is "file_upload_part.txt", during the upload you can expect a list like:
+
+- file_upload_part.txt
+- file_upload_part-1.txt
+- file_upload_part-2.txt
+
+and so on.
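+
+Choosing the strategy is just another endpoint option. A fragment in the same style as the sketch above (again, names and values are placeholders):
+
+```java
+// progressive: file_upload_part.txt, file_upload_part-1.txt, ...
+// random: the key name gets a random id as suffix instead
+from("direct:streamData")
+    .to("aws2-s3://mycamel-bucket?keyName=file_upload_part.txt"
+        + "&streamingUploadMode=true&namingStrategy=progressive");
+```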
+
+The progressive naming strategy raises a natural question: how does it work when you stop and restart the route?
+
+### Restarting Strategies
+
+The restarting strategies provided in the S3 Streaming upload producer are:
+
+- lastPart
+- override
+
+The lastPart strategy obviously only makes sense in combination with the progressive naming strategy.
+
+When the route restarts, the producer checks the specified bucket for the S3 key name prefix and retrieves the last uploaded index.
+
+That index is then used to resume the upload from the same point.
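+
+On the endpoint this could look like the following fragment (same assumptions as the sketches above):
+
+```java
+from("direct:streamData")
+    .to("aws2-s3://mycamel-bucket?keyName=file_upload_part.txt"
+        + "&streamingUploadMode=true"
+        + "&namingStrategy=progressive"
+        // on restart, look up the last uploaded part index in the
+        // bucket and continue numbering from there
+        + "&restartingPolicy=lastPart");
+```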
+
+### Sample
+
+This feature is very nice to see in action.
+
+In the [camel-examples repository](https://github.com/apache/camel-examples/tree/master/examples/aws/main-endpointdsl-kafka-aws2-s3-restarting-policy) I added an example of the feature consuming from Kafka.
+
+The example polls the Kafka topic s3.topic.1 and uploads batches of 25 messages (or 1 MB of data) as single files into an S3 bucket (mycamel-1).
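+
+The core route could be sketched in plain Java DSL like this (the actual example uses the endpoint DSL and externalized configuration, so take this only as an illustration; the broker address and key name are placeholders):
+
+```java
+import org.apache.camel.builder.RouteBuilder;
+
+public class KafkaToS3Route extends RouteBuilder {
+    @Override
+    public void configure() {
+        // consume from the topic polled by the example
+        from("kafka:s3.topic.1?brokers=localhost:9092")
+            .to("aws2-s3://mycamel-1?keyName=file_upload_part.txt"
+                + "&streamingUploadMode=true"
+                // complete a batch at 25 messages or ~1 MB of data
+                + "&batchMessageNumber=25"
+                + "&batchSize=1000000"
+                + "&namingStrategy=progressive"
+                + "&restartingPolicy=lastPart");
+    }
+}
+```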
+
+The [how to run section](https://github.com/apache/camel-examples/tree/master/examples/aws/main-endpointdsl-kafka-aws2-s3-restarting-policy#how-to-run) of the README explains how to ingest data into your Kafka broker.
+
+### Conclusion
+
+The streaming upload feature will be useful in situations where users don't know the amount of data they want to upload to S3, but also when they just want to ingest data continuously without having to care about the size.
+
+There is probably more work to do, but this feature could also be introduced in other storage components we have in Apache Camel.
+