Issue in migrating a template from one server to another

2023-05-09 Thread Urmila Pal
Hello Team,

I am trying to migrate a few process groups from my canvas to another NiFi
server. I have created a template of those process groups and am trying to
import it on the other server.

While doing so, it shows the following message:
[screenshot of the error message omitted]

I have also copied the flow.xml.gz and flow.json.gz files from this server to
the other one, but it still shows the same message.


Could you please guide me through this?

Thank you,
Urmila Pal




RE: Need Help in migrating Giant CSV from S3 to SFTP

2023-05-09 Thread Kumar, Nilesh via users
Hi Eric

I see the following for my content repository. Can you please help me tweak it
further? I have deployed NiFi on K8s as a 3-replica pod cluster with no resource
limits, but I guess pod CPU/memory will still be throttled by the node capacity
itself. I noticed that when I have one single file of 400GB, all the load goes
to whichever node picks up the transfer. I wanted to know if there is any other
way of configuring the flow; if not, please tell me which NiFi metrics to tweak.
[screenshot of content repository metrics omitted]
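
As a quick check of what a node can actually give each pod (a sketch, assuming
kubectl access to the cluster; <node-name> is a placeholder):

    # list the nodes backing the 3 NiFi pods
    kubectl get nodes
    # show the CPU/memory a node can actually allocate to pods
    kubectl describe node <node-name> | grep -A 6 Allocatable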

From: Eric Secules 
Sent: Tuesday, May 9, 2023 9:26 PM
To: users@nifi.apache.org; Kumar, Nilesh 
Subject: [EXT] Re: Need Help in migrating Giant CSV from S3 to SFTP

Hi Nilesh,

Check the size of your content repository. If you want to transfer a 400GB file
through NiFi, your content repository must be larger than 400GB; someone else
might have a better idea of how much bigger you need, but generally it all
depends on how many of these big files you want to transfer at the same time.
You can check the content repository metrics in the Node Status dialog from the
hamburger menu in the top-right corner of the canvas.

-Eric

On Tue., May 9, 2023, 8:42 a.m. Kumar, Nilesh via users,
<users@nifi.apache.org> wrote:

Hi Team,

I want to move a very large file, around 400GB, from S3 to SFTP. I have used
ListS3 -> FetchS3Object -> PutSFTP. This works for smaller files up to 30GB but
fails for larger (100GB+) files. Is there any way to configure this flow so
that it handles a very large single file? If any template for this exists,
please share.

All of my processors use the standard default configuration.



Thanks,

Nilesh





Re: Need Help in migrating Giant CSV from S3 to SFTP

2023-05-09 Thread Eric Secules
Hi Nilesh,

Based on that graph, I think each node in your cluster only has 173GB of
content storage. It makes sense that you're having trouble transferring files
greater than 100GB: depending on the node a file is assigned to and what else
is going on in the cluster, NiFi may not be able to evict enough other content
claims to create one for your 100GB-or-larger files. To do the file transfer
through NiFi you must increase the size of the content repository on each of
your NiFi nodes so that it's bigger than 400GB plus a wide safety margin. You
should also put the FetchS3Object -> PutSFTP part of your flow within a process
group and configure its FlowFile Concurrency setting so it only allows one
FlowFile per node into the process group at a time. There are no other
processors or flow settings I know of that will let you stream the file
transfer in smaller chunks, so I'm afraid there is no flow-only solution to
this problem.

Here's some helpful additional reading:
https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418
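
For concreteness, these are the nifi.properties settings that govern the
content repository and its archive (the values shown are NiFi's defaults, for
illustration only, not tuned recommendations):

    # conf/nifi.properties
    # where content claims are stored; size this volume well above 400GB
    nifi.content.repository.directory.default=./content_repository
    # archiving keeps already-processed content around for provenance replay;
    # NiFi tries to keep total repository usage below this share of the disk
    nifi.content.repository.archive.enabled=true
    nifi.content.repository.archive.max.usage.percentage=50%
    nifi.content.repository.archive.max.retention.period=12 hours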

-Eric

On Tue., May 9, 2023, 9:07 a.m. Kumar, Nilesh wrote:

> [quoted thread trimmed]


Re: Need Help in migrating Giant CSV from S3 to SFTP

2023-05-09 Thread Joe Witt
Nilesh,

These processors generally are not memory sensitive, as they should only ever
hold small amounts of content in memory at a time, so this should work well up
to objects of 100s of GB and beyond. We of course don't really test at that
scale, but it is technically reasonable and designed as such. So what would be
the bottleneck? Exactly what Eric is flagging.

You will need a content repository large enough to hold as much data in flight
as you'll have at any one time. It looks like you have single files as large as
400GB, with some in the 10s or 100s of GB as well, and I'm guessing many can
arrive at or around the same time. So you'll need a far larger content
repository than you're currently using. Your screenshot shows that free space
on any single node is on average 140GB, which leaves very little headroom for
what you're trying to do. You should aim to have a TB or more available per
node for this kind of case.
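
On K8s, that usually means sizing the persistent volume behind each pod's
content repository to match. A minimal sketch of what that could look like in a
StatefulSet's volumeClaimTemplates (the names and storage class below are
placeholders, not taken from your deployment):

    volumeClaimTemplates:
      - metadata:
          name: content-repository        # placeholder name
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: standard      # placeholder storage class
          resources:
            requests:
              storage: 1Ti                # roughly the TB-per-node headroom above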

You mention it fails, but please provide details showing how it fails, along
with the logs.

Also, please do not use load balancing on every connection. You want to use
that feature selectively, by design choice. For now I'd avoid it entirely, or
use it only between the listing and fetching processors, but certainly not
after fetching, given how massive the content is that would have to be
shuffled around.

Thanks

On Tue, May 9, 2023 at 9:07 AM Kumar, Nilesh via users
<users@nifi.apache.org> wrote:

> [quoted thread trimmed]