Prabhu,

Certainly, the performance that you are seeing, taking 4-5 hours to move 300,000
rows into SQL Server, is far from ideal, but the good news is that it is also
far from typical. You should be able to see far better results.

To help us understand what is limiting the performance, and to make sure that
we understand what you are seeing, I have a series of questions.

The first processor that you see exhibiting poor performance is ExtractText,
correct?
Can you share the configuration that you have for that processor?
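For context on why ExtractText can slow down: it evaluates a regular expression against the flowfile content and copies each capture group into an attribute, so broad patterns full of `.*` can backtrack heavily over large content. Here is a minimal sketch of the anchored, narrow-character-class style that tends to perform well; the record layout and pattern are hypothetical, and Python's `re` module stands in for the Java regex engine NiFi actually uses:

```python
import re

# Hypothetical 3-column comma-separated record, standing in for one
# line of the .dat file after the SplitText processors.
line = "101,john,chennai"

# Anchored pattern with narrow character classes: each group stops at
# the next comma rather than backtracking across the whole line, which
# is the behavior you want from an ExtractText pattern as well.
pattern = re.compile(r"^([^,]+),([^,]+),([^,]+)$")
match = pattern.match(line)
print(match.groups())  # ('101', 'john', 'chennai')
```

The same idea applies when you paste a pattern into an ExtractText dynamic property: anchoring and avoiding greedy wildcards keeps the cost per flowfile low.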

How big is your Java heap? It is set in conf/bootstrap.conf; the default is:
java.arg.2=-Xms512m
java.arg.3=-Xmx512m
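If you are splitting a 100+ MB file down to one flowfile per line, that default 512 MB heap can itself become a bottleneck. A common first step is to raise both values; the 2g figure below is only an illustration, so size it to what the machine can spare:

```
# conf/bootstrap.conf -- illustrative sizing, not a recommendation
java.arg.2=-Xms2g
java.arg.3=-Xmx2g
```

A restart of NiFi is needed for the change to take effect.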

Do you have backpressure configured on the connection between ExtractText and 
ReplaceText?

Also, when you say that you specify concurrent tasks, what value are you
configuring the concurrent tasks to be? Have you changed the maximum number of
concurrent tasks available to your dataflow? By default, NiFi will use only 10
threads at most. How many CPUs are available on this machine?

And finally, are these the only processors in your flow, or do you have other
dataflows running in the same NiFi instance?

Thanks
-Mark


> On Oct 17, 2016, at 3:35 AM, prabhu Mahendran <prabhuu161...@gmail.com> wrote:
> 
> Hi All,
> 
> I am trying to perform the operation below.
> 
> .dat file (input) --> JSON --> SQL --> SQL Server
> 
> 
> GetFile-->SplitText-->SplitText-->ExtractText-->ReplaceText-->ConvertJsonToSQL-->PutSQL.
> 
> My input file (.dat) contains 300,000 rows.
> 
> Objective: Move the data from '.dat' file into SQLServer.
> 
> I am able to store the data in SQL Server using the combination of processors
> above, but it takes almost 4-5 hrs to move the complete data into SQL Server.
> 
> The combination of SplitText processors reads the data quickly, but ExtractText
> takes a long time to match the data against the user-defined expression. Even
> though the input is 107 MB, it sends output only in KB-sized chunks, and the
> ReplaceText processor likewise processes data only in KB sizes.
> 
> This slow processing means that moving the data into SQL Server takes much
> more time.
> 
> 
> The ExtractText, ReplaceText, and ConvertJsonToSQL processors send outgoing
> flowfiles only in kilobyte sizes.
> 
> If I specify concurrent tasks for ExtractText, ReplaceText, and
> ConvertJsonToSQL, they occupy 100% of the CPU and disk usage.
> 
> It is just 30 MB of data, but the processors take 6 hrs to move it into
> SQL Server.
> 
> The problems I am facing:
> 
>        It takes almost 6 hrs to move the 300,000 rows into SQL Server.
>        ExtractText and ReplaceText take a long time to process the data
> (they send output flowfiles only in KB sizes).
> Can anyone help me solve the requirement below?
> 
> I need to reduce the time the processors take to move these hundreds of
> thousands of rows into SQL Server.
> 
> 
> 
> If I have done anything wrong, please help me get it right.
