Nilesh,

The issue is that you're running out of space on the disk. Ask your devops
team to provision significantly more space for the partition where the
content repository resides. Adding an extra 500GB should give you more than
enough room for this transfer, plus a buffer in case you later run flows
that mutate the data.
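If it helps, here's a quick sketch for checking how much space that partition actually has. The repository path below is just a common NiFi default layout, not necessarily yours; check `nifi.content.repository.directory.default` in your nifi.properties for the real location.

```shell
# Path is an assumption based on the default NiFi layout -- adjust to match
# nifi.content.repository.directory.default in nifi.properties.
CONTENT_REPO="${CONTENT_REPO:-/opt/nifi/nifi-current/content_repository}"

# Fall back to the root filesystem so the check still runs if the
# default path doesn't exist on this machine.
[ -d "$CONTENT_REPO" ] || CONTENT_REPO=/

# Free space on the filesystem backing the content repository.
df -h "$CONTENT_REPO"

# Rough size of what's currently stored there.
du -sh "$CONTENT_REPO"
```

Run this on each node (or in each pod); with a 400GB single file in flight, the `Avail` column for that filesystem needs to be comfortably above 400GB.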

On Tue, May 9, 2023 at 2:53 PM Joe Witt <joe.w...@gmail.com> wrote:

> Nilesh,
>
> These processors are generally not memory sensitive, as they only ever
> hold small amounts of data in memory at a time, so this should work well
> even for objects in the hundreds of GB. We of course don't routinely test
> at that scale, but it is technically reasonable and the processors are
> designed for it. So what would be the bottleneck? Exactly what Eric is
> flagging.
>
> You will need a content repository large enough to hold as much data in
> flight as you'll have at any one time.  It looks like you have single files
> as large as 400GB, with others in the 10s to 100s of GB, and I'm guessing
> many can arrive at around the same time.  So you'll need a far larger
> content repository than you're currently using.  Your data shows that free
> space on any single node averages 140GB, which leaves very little headroom
> for what you're trying to do.  You should aim to have a TB or more
> available per node for this kind of case.
>
> You mention it fails, but please provide details on how it fails, including
> the logs.
>
> Also, please do not use load balancing on every connection.  That feature
> should be used selectively, as a deliberate design choice.  For now I'd
> avoid it entirely, or use it only between listing and fetching.  Certainly
> do not use it after fetching, given how massive the content is that would
> have to be shuffled around.
>
> Thanks
>
> On Tue, May 9, 2023 at 9:07 AM Kumar, Nilesh via users <
> users@nifi.apache.org> wrote:
>
>> Hi Eric
>>
>>
>>
>> I see the following for my content repository. Can you please help me
>> tweak it further? I have deployed NiFi on K8s as a 3-replica pod cluster
>> with no resource limits, though I guess pod CPU/memory will be throttled
>> by node capacity anyway. I noticed that when I have a single 400GB file,
>> all the load goes to whichever one node picks up the transfer. I wanted
>> to know whether the flow can be configured any other way. If not, please
>> tell me which NiFi metrics to tweak.
>>
>>
>>
>> *From:* Eric Secules <esecu...@gmail.com>
>> *Sent:* Tuesday, May 9, 2023 9:26 PM
>> *To:* users@nifi.apache.org; Kumar, Nilesh <nileshkum...@deloitte.com>
>> *Subject:* [EXT] Re: Need Help in migrating Giant CSV from S3 to SFTP
>>
>>
>>
>> Hi Nilesh,
>>
>>
>>
>> Check the size of your content repository. If you want to transfer a
>> 400GB file through NiFi, your content repository must be larger than
>> 400GB; someone else might have a better idea of exactly how much bigger
>> you need. Generally it depends on how many of these big files you want to
>> transfer at the same time. You can check the content repository metrics in
>> the Node Status view, reached from the hamburger menu in the top-right
>> corner of the canvas.
>>
>>
>>
>> -Eric
>>
>>
>>
>> On Tue., May 9, 2023, 8:42 a.m. Kumar, Nilesh via users, <
>> users@nifi.apache.org> wrote:
>>
>> Hi Team,
>>
>> I want to move a very large file, around 400GB, from S3 to SFTP. I have
>> used ListS3 -> FetchS3Object -> PutSFTP. This works for smaller files up
>> to 30GB but fails for larger (100GB+) files. Is there any way to configure
>> this flow so that it handles a very large single file? If a template
>> exists, please share it.
>>
>> My configurations are all the standard processor defaults.
>>
>>
>>
>> Thanks,
>>
>> Nilesh
>>
>>
>>
>>
>>
>>
>>
>>
>>
