Hi

This is on the roadmap, via something called the resume strategy together with kamelets. The resume strategy will be implemented on components that can resume from a specific point, e.g. file/ftp resuming from N bytes into the file. A kamelet used as the source for the Kafka connector can then handle all of this: use the resume strategy, split the content, etc.

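To make the "resume from N bytes into the file" idea concrete, here is a minimal plain-Java sketch. It is not the Camel resume strategy API and not what the PR implements; the class and method names (ResumableChunkReader, readChunk, Chunk) are made up for illustration. It assumes a newline-delimited text file and reads bounded chunks from a saved offset, so a restart can continue where it left off.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration only: none of these names exist in Apache Camel.
public class ResumableChunkReader {

    /** One bounded piece of the file plus the offset to resume from next. */
    public static final class Chunk {
        public final byte[] data;
        public final long nextOffset;

        Chunk(byte[] data, long nextOffset) {
            this.data = data;
            this.nextOffset = nextOffset;
        }
    }

    /**
     * Reads at most maxBytes starting at offset. Unless the end of the file
     * was reached, the chunk is cut after the last newline in the buffer so
     * lines stay whole; a single line longer than maxBytes is still cut
     * mid-line. Returns null when offset is at or past the end of the file.
     */
    public static Chunk readChunk(Path file, long offset, int maxBytes) throws IOException {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (offset >= length) {
                return null;
            }
            raf.seek(offset);
            byte[] buf = new byte[maxBytes];
            int n = 0;
            // read() may return fewer bytes than asked for, so loop until full or EOF
            while (n < maxBytes) {
                int r = raf.read(buf, n, maxBytes - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
            int end = n;
            if (offset + n < length) {
                // not at EOF: back up to just after the last newline, if there is one
                for (int i = n - 1; i >= 0; i--) {
                    if (buf[i] == '\n') {
                        end = i + 1;
                        break;
                    }
                }
            }
            return new Chunk(Arrays.copyOf(buf, end), offset + end);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("resume-demo", ".log");
        Files.write(file, "line one\nline two\nline three\n".getBytes(StandardCharsets.UTF_8));

        long offset = 0; // a real setup would store this durably after each successful send
        Chunk chunk;
        while ((chunk = readChunk(file, offset, 12)) != null) {
            System.out.print("record: " + new String(chunk.data, StandardCharsets.UTF_8));
            offset = chunk.nextOffset;
        }
        Files.delete(file);
    }
}
```

Each chunk would become one Kafka record that stays under the size limit, and the saved offset is what lets a restarted connector skip what was already sent. Making the send and the offset update atomic is a separate problem that this sketch does not address.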
There is some basic implementation today, but we are working on a v2 with a more generic design. A draft prototype is in the works at
https://github.com/apache/camel/pull/6947

Otavio knows more about this, so keep an eye out for it, and provide feedback here.

On Wed, Feb 9, 2022 at 4:47 PM Sergio Meana <sme...@darksecurities.com> wrote:
>
> Hello Raymond
>
> Many thanks for your quick response.
>
> 1) How big are the files? Is there a specific threshold when doesn't it work
> anymore?
>
> Files are between 250MB and 500MB
>
> 2) What kind of error do you get and where do you get it? (On Camel side or
> Kafka side, is it memory related)?
>
> I am getting message.max.bytes, we can't change the config to have it bigger
> that 1MB (company requirements).
> I tested the connector with small files of Kbytes and it sends the whole file
> in one record to Kafka. We want to send the file in small batches and have an
> atomic transaction. If the connector fails in the middle just keep track to
> don't send one more time the same information or lost data.
>
> 3) What kind of files are you using (csv, xml, json)?
>
> Plain text. It is info data or logs.
>
> 4) Did you try to use the split pattern
> (https://camel.apache.org/components/3.14.x/eips/split-eip.html)?
>
> Can it be integrated with the SFTP source connector? Is any example available?
>
> Many thanks
> Sergio
>
> -------- Original Message --------
> On 3 Feb 2022, 12:51, ski n wrote:
>
> > Hi Sergio,
> >
> > Can you tell a bit more about your use case?
> >
> > 1) How big are the files? Is there a specific threshold when doesn't it
> > work anymore?
> > 2) What kind of error do you get and where do you get it? (On Camel side or
> > Kafka side, is it memory related)?
> > 3) What kind of files are you using (csv, xml, json)?
> > 4) Did you try to use the split pattern
> > (https://camel.apache.org/components/3.14.x/eips/split-eip.html)?
> >
> > More in general: I know often you have no influence in the files that are
> > at the source, but working message-based (instead of file-based like SFTP)
> > and with smaller messages (so the load is spread evenly over time) and
> > real-time/event-based (send change when it occurs) mostly is the best
> > approach. If not, you mostly end up with such issues when scaling.
> >
> > Kind regards,
> >
> > Raymond
> >
> > On Thu, Feb 3, 2022 at 11:19 AM Sergio Meana <sme...@darksecurities.com>
> > wrote:
> >
> >> Hello
> >>
> >> We are trying to read files using the following connector
> >> https://github.com/apache/camel-kafka-connector-examples/tree/main/sftp/sftp-source
> >> The connector send the whole file in one record to kafka which it is
> >> failing with big files
> >>
> >> It is possible to override the apply method to convert the whole record
> >> into small batches?
> >>
> >> public R apply(R r) {
> >>
> >> We unsuccessfully tried a couple of thing but not luck
> >>
> >> Thanks in advance
> >>
> >> Sergio Meana
> >> Chief Quant Developer| Data Scientist
> >> DARK SECURITIES LTD

--
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2