Re: How to know the processors last running time?

2018-03-19 Thread prabhu Mahendran
Thank you for the help.

On Mon, Mar 19, 2018 at 5:13 PM, Jorge Machado <jom...@me.com> wrote:

> From Nifi default I don’t know any other way. You can always create a
> entry on a table and write the information that you need into it.
>
> Jorge
>
>
> On 19 Mar 2018, at 12:22, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Thanks for the update.
>
> Is there any other option other than logAttribute?
>
> On Mon, Mar 19, 2018 at 3:50 PM, Jorge Machado <jom...@me.com> wrote:
>
>> you could end your flow with a logAttribute Processor for example. Would
>> that resolve your issues ?
>>
>> Jorge Machado
>>
>>
>>
>>
>>
>>
>> On 19 Mar 2018, at 07:57, prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>> i'm having group of processors to perform some functionalities.
>>
>> I have scheduled the starting processor to be run upon daily at
>> particular time by using cron expression.
>>
>> I need to know that the processors last running time.
>>
>> just consider an example if i having getfile,generateflowfile processor
>> triggers entire workflow by cron. At some times, i enter into that workflow
>> not able to ensure that processor running as per cron or not because it
>> entire processing completed at that time.
>>
>> can anyone suggest me the best way to ensure the processors last running
>> time and its processing attributes?.
>>
>>
>>
>
>


Re: How to know the processors last running time?

2018-03-19 Thread prabhu Mahendran
Thanks for the update.

Is there any option other than LogAttribute?

On Mon, Mar 19, 2018 at 3:50 PM, Jorge Machado <jom...@me.com> wrote:

> you could end your flow with a logAttribute Processor for example. Would
> that resolve your issues ?
>
> Jorge Machado
>
>
>
>
>
>
> On 19 Mar 2018, at 07:57, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> i'm having group of processors to perform some functionalities.
>
> I have scheduled the starting processor to be run upon daily at particular
> time by using cron expression.
>
> I need to know that the processors last running time.
>
> just consider an example if i having getfile,generateflowfile processor
> triggers entire workflow by cron. At some times, i enter into that workflow
> not able to ensure that processor running as per cron or not because it
> entire processing completed at that time.
>
> can anyone suggest me the best way to ensure the processors last running
> time and its processing attributes?.
>
>
>


How to know the processors last running time?

2018-03-19 Thread prabhu Mahendran
I have a group of processors that perform some functionality.

I have scheduled the starting processor to run daily at a particular time
using a cron expression.

I need to know each processor's last run time.

For example, a GetFile or GenerateFlowFile processor triggers the entire
workflow via cron. Sometimes, when I open the workflow, I cannot tell
whether the processor actually ran as scheduled, because all of its
processing has already completed by that time.

Can anyone suggest the best way to determine a processor's last run time
and the attributes it processed?


Is this possible to create table from JSON?

2018-02-13 Thread prabhu Mahendran
Hi All,

I am able to generate an Avro schema from JSON.

It looks like the following:

{
  "type" : "array",
  "items" : {
    "type" : "record",
    "name" : "AvroSchema",
    "fields" : [
      { "name" : "Name", "type" : "string", "doc" : "Type inferred from 'S'" },
      { "name" : "Age",  "type" : "int",    "doc" : "Type inferred from '23'" }
    ]
  }
}

With the help of that, I should extract the fields (name and type) and
create a table in the database from that schema.

Can anyone suggest the best way to create a table in SQL Server from NiFi
based on an Avro schema?
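
For what it's worth, below is a minimal sketch of the kind of logic that
could turn such a schema into a CREATE TABLE statement (plain Python/Jython,
e.g. inside ExecuteScript). The type mapping, table name, and file name are
assumptions for illustration, not something from this thread.

import json

# Hypothetical Avro-to-SQL-Server type mapping; extend as needed.
TYPE_MAP = {"string": "NVARCHAR(255)", "int": "INT", "long": "BIGINT",
            "float": "REAL", "double": "FLOAT", "boolean": "BIT"}

def create_table_sql(avro_schema_text, table_name):
    schema = json.loads(avro_schema_text)
    # The inferred schema above is an array of records; unwrap it if needed.
    record = schema["items"] if schema.get("type") == "array" else schema
    columns = []
    for field in record["fields"]:
        ftype = field["type"]
        # Nullable unions such as ["string", "null"]: take the non-null branch.
        if isinstance(ftype, list):
            ftype = [t for t in ftype if t != "null"][0]
        columns.append("[%s] %s" % (field["name"], TYPE_MAP.get(ftype, "NVARCHAR(MAX)")))
    return "CREATE TABLE %s (%s)" % (table_name, ", ".join(columns))

# Example usage: the generated statement could then be sent to PutSQL.
print(create_table_sql(open("schema.json").read(), "AvroSchema"))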


Re: Nifi: how to split logs?

2017-11-26 Thread prabhu Mahendran
Sally,

Try this link (
https://community.hortonworks.com/articles/131320/using-partitionrecord-grokreaderjsonwriter-to-pars.html)
.


GrokReader lets you read the log file with a Grok expression, and you can
then keep only the error, info, and warn records using RouteOnAttribute.

Hope this helps.

On Mon, Nov 27, 2017 at 10:54 AM, sally 
wrote:

> i  want to  parse   only  error ,info,  warn data  without  provinance
>
>
>
> --
> Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/
>


Is this possible to access shared network files in NiFi?

2017-09-05 Thread prabhu Mahendran
I want to process data from a shared drive in NiFi.

There is a shared file on the connected network, and it prompts for
credentials before the file can be accessed.

I need to fetch that file from the network share (//hostname/shared/file)
using credentials (username/password) and then process it in NiFi.

Can anyone suggest a way to download a file from a shared drive with
credentials in NiFi?


How to combine the row with column wise in nifi?

2017-08-07 Thread prabhu Mahendran
I have the following sample data.

,r1c2,r1c3,r1c4,r1c5
Date,r2c2,r2c3,r2c4,r2c5
07-08-2017,r3c2,r3c3,r3c4,r3c5

I need to convert that data to the following:

07-08-2017,r1c2,r2c2,r3c2
07-08-2017,r1c3,r2c3,r3c3
07-08-2017,r1c4,r2c4,r3c4
07-08-2017,r1c5,r2c5,r3c5

I tried this with a NiFi looping approach, but it keeps the entire input in
memory, which hurts performance.

After searching, I found that the ExecuteScript processor in NiFi can be
used for this kind of requirement.

I am new to Jython/Groovy scripting; I tried the following script, but the
end result is not what I expected.

https://stackoverflow.com/questions/10507104/how-to-do-row-to-column-transposition-of-data-in-csv-table/10507199#10507199

I need to convert the data as described above in NiFi, without using any
external dependencies.

Can anyone guide me on how to achieve this?
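
For reference, a minimal plain-Python/Jython sketch of that transposition
(it assumes the whole file fits in memory; the file names input.csv and
output.csv are placeholders):

import csv

# Read all rows, then emit one output row per input column,
# prefixed with the date taken from the first cell of the last row.
with open("input.csv") as f:
    rows = list(csv.reader(f))

date = rows[-1][0]  # e.g. "07-08-2017"
with open("output.csv", "w") as out:
    writer = csv.writer(out)
    for col in range(1, len(rows[0])):
        writer.writerow([date] + [row[col] for row in rows])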


Re: MaxHistoryNot Worked in logback-1.2.3?

2017-07-19 Thread prabhu Mahendran
Hi Aldrin,

Thanks for your response.

I want to keep only 5 files, each about 100 KB in size. Once the total size
of the files reaches 500 KB, the oldest file should be deleted before a new
one is written.

I have tried SizeAndTimeBasedRollingPolicy, but it is not a good fit for
this requirement.

Given that requirement, which rolling policy would you recommend I use in
NiFi?

Thanks,
Mahendran
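
For what it's worth, logback's fixed-window policy is the usual way to
express "keep only N files of a fixed size"; a sketch under that assumption
(the appender name, file paths, and exact limits are placeholders, not a
confirmed NiFi configuration):

<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/nifi-app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
        <fileNamePattern>logs/nifi-app.%i.log</fileNamePattern>
        <minIndex>1</minIndex>
        <maxIndex>5</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
        <maxFileSize>100KB</maxFileSize>
    </triggeringPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
    </encoder>
</appender>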


MaxHistoryNot Worked in logback-1.2.3?

2017-07-17 Thread prabhu Mahendran
I have tried the configuration below to write logs based on
SizeAndTimeBasedRollingPolicy.

With it, files are rolled correctly at the 100 KB file size, but keeping
only 2 history files (maxHistory) does not work: a new file is always
written after each 100 KB file completes, and the old ones are not removed.

I am using NiFi 1.3.0, which ships logback version 1.2.3 for the core and
classic jars.

<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
        <fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app_%d{yyyy-MM-dd}.%i.log</fileNamePattern>
        <maxFileSize>100 KB</maxFileSize>
        <maxHistory>2</maxHistory>
        <cleanHistoryOnStart>true</cleanHistoryOnStart>
    </rollingPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
    </encoder>
</appender>

The issue I am facing may be related to the task below.

https://jira.qos.ch/browse/LOGBACK-162

I am using NiFi 1.3.0, which only uses logback 1.2.3.

In NiFi I found a JIRA for upgrading logback:

https://issues.apache.org/jira/browse/NIFI-3699

I believe the latest logback version is 1.2.3, and even there the
maxHistory property does not seem to work.

Am I missing anything?

Can anyone suggest how to solve this?


Re: how to use LookUpRecord processor?

2017-07-10 Thread prabhu Mahendran
Thank you for your suggestions☺
It worked
On 10-Jul-2017 7:26 PM, "Bryan Bende" <bbe...@gmail.com> wrote:

> I think FreeFormTextRecordSetWriter can only access fields from the
> read schema, so you are correct that this would be a problem since
> "Sex" wasn't in the read schema.
>
> You could change your CSVReader to have 'Schema Access Strategy" ->
> "Use Schema Name" and then create an AvroSchemaRegistry and define a
> schema with the "Sex" field like:
>
> {
>   "name": "custom",
>   "namespace": "nifi",
>   "type": "record",
>   "fields": [
> { "name": "No", "type": "int" },
> { "name": "Name", "type": "string" },
> { "name": "ID", "type": "int" },
> { "name": "Age", "type": "int" },
> { "name": "Sex", "type": ["string", "null"] },
>   ]
> }
>
> Make sure your flow file has an attribute "schema.name" with the name
> that you used when you added the above schema to the
> AvroSchemaRegistry.
>
>
>
> On Mon, Jul 10, 2017 at 1:16 AM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Hi bryan,
> >
> > Thanks for your suggestion.
> >
> > i have followed your steps and am have an one doubt regarding your schema
> > creation for "Sex" column.
> >
> > i haven't using schema in created flow.
> >
> > You have said that flow have a field called "Sex" in the schema being
> used
> > by the record writer.i'm FreeFormSetWriter only not "CSVRecordWriter"
> >
> > In that CSVReader i have configured "Schema Access Strategy"-->"Use
> String
> > fields from Header".
> >
> > FreeFormSetWriter,i have specified the ${ID},${Name},${Sex} only.
> >
> > i wants you to know when i have to create schema for "Sex".
> >
> > Please stop me if am understands anything wrong.
> >
> > Can you please guide me to create schema for my requirement?
> >
> > Thanks,
> >  prabhu
> >
> > On Fri, Jul 7, 2017 at 5:05 PM, Bryan Bende <bbe...@gmail.com> wrote:
> >>
> >> Hi Prabhu,
> >>
> >> The SimpleCsvFileLookupService is meant to look up one value and add
> >> it back to the records.
> >>
> >> So lets say you want to lookup the gender and add it to the original
> >> records...
> >>
> >> You would configure SimpleCsvFileLookupService with the following:
> >>
> >> - Lookup Key Column = ID
> >> - Lookup Key Value = Sex
> >>
> >> When the service starts it will then make a map of ID to Sex so you
> would
> >> have:
> >>
> >> 2201 -> Male
> >> 3300 -> Female
> >>
> >> Now in LookupRecord you would add a user-defined property of "key" =
> >> "ID" since ID is the column from the incoming records that would to
> >> use as the key into the above map.
> >>
> >> Then "Result Record Path" should be the field in the records where you
> >> want the result of the look up go to, so you would want something like
> >> "/Sex".
> >>
> >> You'll also need to have a field called "Sex" in the schema being used
> >> by the record writer. You could make one schema that has a nullable
> >> Sex field and have the CsvReader and FreeFormTextWriter both reference
> >> that schema, or you could let the CsvReader infer the schema from the
> >> fields (it won't have a sex field) and then use a different schema for
> >> the writer.
> >>
> >> -Bryan
> >>
> >>
> >> On Fri, Jul 7, 2017 at 6:28 AM, prabhu Mahendran
> >> <prabhuu161...@gmail.com> wrote:
> >> > I tried to join two csv file based on id with respect to the below
> >> > reference.
> >> >
> >> >
> >> > How to join two CSVs with Apache Nifi
> >> >
> >> >
> >> > i'm using NiFi-1.3.0
> >> >
> >> >
> >> > Now i have two csv files.
> >> >
> >> >
> >> > 1.custom.csv
> >> >
> >> >
> >> > No,Name,ID,Age
> >> >
> >> > 1,Hik,2201,33
> >> >
> >> > 2,Kio,3300,22
> >> >

how to use LookUpRecord processor?

2017-07-07 Thread prabhu Mahendran
I tried to join two CSV files based on an ID, following the reference
below:

How to join two CSVs with Apache Nifi

I'm using NiFi-1.3.0.

Now I have two CSV files.

1. custom.csv

No,Name,ID,Age
1,Hik,2201,33
2,Kio,3300,22

2. gender.csv

ID,Name,Sex
2201,Hik,Male
3300,Kio,Female

I am trying to combine those tables on "ID" to get the following end
result:

No,Name,Sex,ID,Age
1,Hik,Male,2201,33
2,Kio,Female,3300,22

I am using the following processor structure:

GetFile - SplitText - ExtractText - LookUpRecord - PutFile

In that LookupRecord processor I have configured:

RecordReader = "CSVReader"
RecordWriter = "FreeFormTextRecordSetWriter"
LookUpService = "SimpleCSVFileLookUpService"
ResultRecordPath --> /Key
key --> /ID

In that lookup service I have given the path of "gender.csv" and set the
LookUpKeyColumn and LookUpValueColumn to "ID".

In the FreeFormTextRecordSetWriter I have given the text value
"${No},${Name},${ID},${Age},${Sex}".

It yields only the result below, which doesn't have the "Sex" column:

No,Name,Sex,ID,Age,
1,Hik,Male,2201,33,
2,Kio,Female,3300,22,

I think I have not configured it correctly. I don't know how to use
ResultRecordPath and the dynamic attribute (key) specification in
LookUpRecord.

Can anyone guide me to solve my issue?


How to pass the username while download file from Web Url?

2017-06-27 Thread prabhu Mahendran
I have a secured download link for a file, so I configured the link in an
InvokeHTTP processor with a StandardSSLContextService and the truststore
certificate. In the response I only get HTML tags, i.e., the page source
of the link.

Using an open-source REST client, the data downloads properly if I am
logged in to the link; if I am not logged in, the page source HTML is the
result.

I suspect the same behavior applies to InvokeHTTP. How do I resolve this so
the link's credentials are accepted? I have configured the username and
password in InvokeHTTP, but it fails.


Re: How to update line with modified data in Jython?

2017-06-20 Thread prabhu Mahendran
Thank you, Matt, for this response.

Yeah, it worked.
On 20-Jun-2017 7:55 AM, "Matt Burgess" <mattyb...@apache.org> wrote:

Prabhu,

I'm no Python/Jython master by any means, so I'm sure there's a better
way to do this than what I came up with. Along the way I noticed some
things about the input data and Jython vs Python:

1) Your "for line in text[1:]:" is skipping the first line, I assume
in the "real" data there is a header?
2) The second row of data refers to a leap day (Feb 29) which did not
exist in 2015 so it throws an exception. I changed all the months to
03 and kept going
3) Your third row doesn't have any fractional seconds, is this on
purpose? I assumed so and tried to provide for that
4) Jython (and Python 2) don't support the %z directive in datetime
formats, and %Z refers to a String like a City or Country in that
timezone or the friendly name of the timezone, not the +-HHMM value.
Also in your data you include only the hour offset, not minutes

I came up with a fairly fragile script that seems to work given your input:

import datetime
import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class PyStreamCallback(StreamCallback):
  logger = None
  def __init__(self, log):
    logger = log
    pass
  def process(self, inputStream, outputStream):
    text = IOUtils.readLines(inputStream, StandardCharsets.UTF_8)
    for line in text[1:]:
      cols = line.split(",")
      df = "%d-%m-%Y %H:%M:%S.%f"
      trunc_3 = True
      try:
        d2 = datetime.datetime.strptime(cols[3][:-3], df)
      except ValueError:
        df = "%d-%m-%Y %H:%M:%S"
        trunc_3 = False
        d2 = datetime.datetime.strptime(cols[3][:-3], df)
      if trunc_3:
        cols[3] = d2.strftime(df)[:-3]
      else:
        cols[3] = d2.strftime(df)
      outputStream.write(",".join(cols) + "\n")

flowFile = session.get()
if (flowFile != None):
  flowFile = session.write(flowFile,PyStreamCallback(log))
  flowFile = session.putAttribute(flowFile, "filename",
flowFile.getAttribute('filename'))
  session.transfer(flowFile, REL_SUCCESS)


Please let me know if I've misunderstood anything, and I will try to
fix/improve the script.

Regards,
Matt

On Mon, Jun 19, 2017 at 8:31 AM, prabhu Mahendran
<prabhuu161...@gmail.com> wrote:
> I'm having one csv which contains lakhs of rows and below is sample
lines..,
>
> 1,Ni,23,28-02-2015 12:22:33.2212-02
> 2,Fi,21,29-02-2015 12:22:34.3212-02
> 3,Us,33,30-03-2015 12:23:35-01
> 4,Uk,34,31-03-2015 12:24:36.332211-02
> I need to get the last column of csv data which is in wrong datetime
format.
> So I need to get default datetimeformat("yyyy-MM-DD hh:mm:ss[.nnn]") from
> last column of the data.
>
> I have tried the following script to get lines from it and write into flow
> file.
>
> import json
> import java.io
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import StreamCallback
>
> class PyStreamCallback(StreamCallback):
>   def __init__(self):
> pass
>   def process(self, inputStream, outputStream):
> text = IOUtils.readLines(inputStream, StandardCharsets.UTF_8)
> for line in text[1:]:
> outputStream.write(line + "\n")
>
> flowFile = session.get()
> if (flowFile != None):
>   flowFile = session.write(flowFile,PyStreamCallback())
>   flowFile = session.putAttribute(flowFile, "filename",
> flowFile.getAttribute('filename'))
>   session.transfer(flowFile, REL_SUCCESS)
> but I am not able to find a way to convert it like below output.
>
> 1,Ni,23,28-02-2015 12:22:33.221
> 2,Fi,21,29-02-2015 12:22:34.321
> 3,Us,33,30-03-2015 12:23:35
> 4,Uk,34,31-03-2015 12:24:36.332
> I have checked those requirement with my friend(google) and still not able
> to find solution.
>
> Can anyone guide me to convert those input data into my required output?


How to update line with modified data in Jython?

2017-06-19 Thread prabhu Mahendran
I have one CSV that contains lakhs of rows; below are sample lines:

1,Ni,23,28-02-2015 12:22:33.2212-02
2,Fi,21,29-02-2015 12:22:34.3212-02
3,Us,33,30-03-2015 12:23:35-01
4,Uk,34,31-03-2015 12:24:36.332211-02
The last column of the CSV data is in the wrong datetime format, so I need
to convert that last column to the default datetime format
("yyyy-MM-DD hh:mm:ss[.nnn]").

I have tried the following script to get lines from it and write into flow
file.

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class PyStreamCallback(StreamCallback):
  def __init__(self):
    pass
  def process(self, inputStream, outputStream):
    text = IOUtils.readLines(inputStream, StandardCharsets.UTF_8)
    for line in text[1:]:
      outputStream.write(line + "\n")

flowFile = session.get()
if (flowFile != None):
  flowFile = session.write(flowFile,PyStreamCallback())
  flowFile = session.putAttribute(flowFile, "filename",
flowFile.getAttribute('filename'))
  session.transfer(flowFile, REL_SUCCESS)
But I am not able to find a way to convert it to the output below:

1,Ni,23,28-02-2015 12:22:33.221
2,Fi,21,29-02-2015 12:22:34.321
3,Us,33,30-03-2015 12:23:35
4,Uk,34,31-03-2015 12:24:36.332
I have searched for this requirement and am still not able to find a
solution.

Can anyone guide me on converting this input data into the required output?


Re: How to transfer files between two windows machines using NiFi?

2017-06-14 Thread prabhu Mahendran
I want to know how to configure ExecuteStreamCommand for the command below:

copy C:\input\ip.txt \\host2\C:\destFolder\ip.txt

If I open a command prompt (from any path) and type this command, it works
in Windows.

But I need to run that command from NiFi.

I tried the command with the following processor properties:

Command Arguments: copy C:\input\ip.txt \\host2\C:\destFolder\ip.txt
Command Path: C:\Windows\system32\cmd.exe
Argument Delimiter: space

After ExecuteStreamCommand runs, a flow file is routed to the output stream
relationship, but the command itself does not take effect.

The command should copy the file (ip.txt) from host1 to the host2 machine.
If I run that command in cmd.exe directly, the file is copied to host2.

But when I configure those parameters in ExecuteStreamCommand, I receive
the output stream, yet the command does not run and the file is not moved
to host2.

Can you suggest a way to solve this?
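
For reference, when cmd.exe is the Command Path, the shell usually needs a
/c argument in front of the built-in command it should run, and a delimiter
other than a space avoids splitting the paths. A hedged sketch of the
properties (the file paths are the ones above; the rest is an assumption):

Command Path:       C:\Windows\system32\cmd.exe
Command Arguments:  /c;copy;C:\input\ip.txt;\\host2\C:\destFolder\ip.txt
Argument Delimiter: ;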

On Wed, Jun 14, 2017 at 5:17 PM, Mark Payne <marka...@hotmail.com> wrote:

> Prabhu,
>
> The recommended approach would be to use Site-to-Site [1] to transfer the
> data between
> the two NiFi instances.
>
> Thanks
> -Mark
>
> [1] http://nifi.apache.org/docs/nifi-docs/html/user-
> guide.html#site-to-site
>
>
>
> On Jun 14, 2017, at 2:53 AM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Hi All,
>
>
> i know that using "RoboCopy" and "Copy" to move the files between two
> machines.
>
> I need to know how to configure those commands ExecuteStreamCommand
> processor in NiFi.
>
> Is there is any other option available without using
> ExecuteStream/ExecuteProcess processors?
>
> Thanks,
>
>
>


How to transfer files between two windows machines using NiFi?

2017-06-14 Thread prabhu Mahendran
Hi All,


I know that "RoboCopy" and "Copy" can be used to move files between two
machines.

I need to know how to configure those commands in the ExecuteStreamCommand
processor in NiFi.

Is there any other option available without using the
ExecuteStreamCommand/ExecuteProcess processors?

Thanks,


Re: How to ensure the rows moved into SQL?

2017-06-12 Thread prabhu Mahendran
Thanks for your response.

I have 10 files that vary in their number of rows.

For example, one file contains 30 lakh rows and another 70 lakh rows. Since
I am using two SplitText processors with a Line Split Count to split the
files, "fragment.count" did not work for me.

I am running NiFi on Windows, so a "wc -l" count does not work either.

Is there any other way to verify the number of rows?
On 12-Jun-2017 11:20 PM, "Andy LoPresto" <alopre...@apache.org> wrote:

> Prabhu,
>
> You can get a row count on the incoming CSV files by routing them through
> SplitText and using the “fragment.count” value as the total number of
> (non-header) lines, or by using an ExecuteStreamCommand with the command
> “wc -l” which counts the number of lines in text. With this knowledge, you
> can then use a SQL query in ExecuteSQL to check the number of rows inserted
> in the last time slice or with a special identifier range by checking
> “max(id)” before the insert begins.
>
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com <alopresto.apa...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Jun 12, 2017, at 5:47 AM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Hi All,
>
> Since i need to know how to check all rows in csv moved in SQL Server.
>
> I have download csv files from HTTP.
>
> Just imagine i have 10 files i could move files one after another into SQL
> Server.
>
> i need to ensure if number of rows in csv moves correctly insert into SQL
> Server.
>
> For example if file having 100 rows to be insert into SQL Server how to
> check 100 rows correctly moved into SQL in NiFi.After move that 100 files
> then row next file moved into SQL.
>
> Is this possible to check number of rows in files equal to sql moved rows
> without using Script?
>
>
>


How to ensure the rows moved into SQL?

2017-06-12 Thread prabhu Mahendran
Hi All,

I need to know how to check that all rows in a CSV have been moved into SQL
Server.

I download CSV files from HTTP.

Imagine I have 10 files and I move them one after another into SQL Server.

I need to ensure that the number of rows in each CSV is correctly inserted
into SQL Server.

For example, if a file has 100 rows to insert into SQL Server, how can I
check in NiFi that all 100 rows were moved into SQL before the next file is
moved?

Is it possible to check that the number of rows in the files equals the
number of rows moved into SQL, without using a script?


Re: How to perform bulk insert into SQLServer from one machine to another?

2017-06-08 Thread prabhu Mahendran
Matt,

Thanks for your wonderful response.

I think creating an FTP server is the best way for me to move the input
file to the SQL machine and run the query.

Can you please suggest a way to create an FTP server on the machine where
SQL Server is installed, using NiFi?

Many thanks,
Prabhu
On 08-Jun-2017 6:27 PM, "Matt Burgess" <mattyb...@gmail.com> wrote:

Prabhu,

>From [1], the data file "must specify a valid path from the server on
which SQL Server is running. If data_file is a remote file, specify
the Universal Naming Convention (UNC) name. A UNC name has the form
\\Systemname\ShareName\Path\FileName. For example,
\\SystemX\DiskZ\Sales\update.txt".  Can you expose the CSV file via a
network drive/location?  If not, can you place the file on the SQL
Server using NiFi?  For example, if there were an FTP server running
on the SQL Server instance, you could use the PutFTP processor, then
PutSQL after that to issue your BULK INSERT statement.

Regards,
Matt

[1] https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-
insert-transact-sql

On Thu, Jun 8, 2017 at 8:11 AM, prabhu Mahendran
<prabhuu161...@gmail.com> wrote:
> i have running nifi instance in one machine and have SQL Server in another
> machine.
>
> Here i can try to perform bulk insert operation with bulk insert Query in
> SQLserver. but i cannot able insert data from one machine and move it into
> SQL Server in another machine.
>
> If i run nifi and SQL Server in same machine then i can able to perform
bulk
> insert operation easily.
>
> i have configured GetFile->ReplaceText(BulkInsertQuery)-->PutSQL
processors.
>
> I have tried both nifi and sql server in single machine then bulk insert
> works but not works when both instances in different machines.
>
> I need to get all data's from one machine and write a query to move that
> data into SQL runs in another machine.
>
> Below query works when nifi and sql server in same machine
>
> BULK INSERT BI FROM 'C:\Directory\input.csv' WITH (FIRSTROW = 1,
> ROWTERMINATOR = '\n', FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1)
> if i run that query in another machine then it says..,"FileNotFoundError"
> due to "input.csv" in Host1 machine but runs query in sql server machine
> (host2)
>
> Can anyone give me suggestion to do this?


How to perform bulk insert into SQLServer from one machine to another?

2017-06-08 Thread prabhu Mahendran
I have a NiFi instance running on one machine and SQL Server on another
machine.

I am trying to perform a bulk insert with a BULK INSERT query in SQL
Server, but I am not able to take data from one machine and move it into
the SQL Server on the other machine.

If I run NiFi and SQL Server on the same machine, the bulk insert works
easily.

I have configured GetFile -> ReplaceText(BulkInsertQuery) -> PutSQL
processors.

With NiFi and SQL Server on a single machine the bulk insert works, but it
does not work when the two instances are on different machines.

I need to take all the data from one machine and write a query that moves
that data into the SQL Server running on the other machine.

The query below works when NiFi and SQL Server are on the same machine:

BULK INSERT BI FROM 'C:\Directory\input.csv' WITH (FIRSTROW = 1,
ROWTERMINATOR = '\n', FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1)

If I run that query against the other machine, it fails with a
file-not-found error, because "input.csv" is on the host1 machine while the
query runs on the SQL Server machine (host2).

Can anyone give me a suggestion on how to do this?
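
Following the UNC-path pointer in the reply above, the same statement could
reference a network share that is visible from the SQL Server machine (the
share name below is hypothetical):

BULK INSERT BI FROM '\\host1\shared\input.csv' WITH (FIRSTROW = 1,
ROWTERMINATOR = '\n', FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1)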


How to find week of the year for the date value?

2017-05-26 Thread prabhu Mahendran
My flow file has an attribute with the value '05-05-2015', and I need to
find the week of the year for that date in NiFi.

For example:

if the date is 05, it belongs to the 1st week of the year.

I should not use a script/program to find the week.

I need to find it using the NiFi Expression Language only.

Can anyone suggest a way to do that?
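
For what it's worth, the Expression Language's format() function uses Java
SimpleDateFormat patterns, so a week-of-year pattern such as the following
should work (the attribute name myDate and the dd-MM-yyyy input format are
assumptions):

${myDate:toDate('dd-MM-yyyy'):format('w')}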


Re: some processors runs only once in NiFi

2017-05-26 Thread prabhu Mahendran
Hi Koji,

Thanks for your explanation

Many thanks
prabhu

On Fri, May 26, 2017 at 9:40 AM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> Same as ListHDFS, GetHTTP uses ETAG HTTP header, and if server returns
> NOT_MODIFIED(304), it doesn't create output FlowFile. The screenshot
> indicates that GetHTTP runs 61 times but it only creates output
> FlowFile once because it's not modified.
>
> I believe that is what's happening.
>
> Thanks,
> Koji
>
> On Wed, May 24, 2017 at 2:30 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Pierre,
> >
> > Thanks for your mail,
> >
> > I might try to list files over and over.So that may be problem i faced.I
> > just modified existing files in hdfs and then list those files using
> > ListHDFS.
> >
> > I could be list files in which same as well as last execution of a
> processor
> > that's may be problem.
> >
> > Many thanks
> >
> >
> > On Wed, May 24, 2017 at 10:45 AM, Pierre Villard
> > <pierre.villard...@gmail.com> wrote:
> >>
> >> Just a quick remark, the ListHDFS processor won't list files over and
> >> over, it'll only list new files since the last execution of the
> processor.
> >> Do you know if new files are generated in the directory your are
> listing?
> >>
> >> Screenshots of your configurations would definitely help.
> >>
> >> 2017-05-24 6:55 GMT+02:00 Joe Witt <joe.w...@gmail.com>:
> >>>
> >>> prabhu - can you please share screenshots and or logs showing that it
> >>> is running only once?
> >>>
> >>> Thanks
> >>>
> >>> On Wed, May 24, 2017 at 12:42 AM, prabhu Mahendran
> >>> <prabhuu161...@gmail.com> wrote:
> >>> > Aldrin,
> >>> >
> >>> > Thanks for your response.
> >>> >
> >>> > For GetHTTP ,I have checked to download different files even it could
> >>> > not
> >>> > run more than once.
> >>> >
> >>> > ListHDFS:I have used NiFi-1.2.0 in which configured these attributes
> >>> > "Hadoop
> >>> > Configuration Resources","Directory","RecurseSubDirectories"
> correctly
> >>> > for
> >>> > Hadoop-2.5.2.This runs only once not run again.
> >>> >
> >>> > Note: I have checked those processors in windows.
> >>> >
> >>> > Can you give any suggestion to solve this?
> >>> >
> >>> > Many Thanks,
> >>> >
> >>> >
> >>> > On Tue, May 23, 2017 at 7:03 PM, Aldrin Piri <aldrinp...@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> For GetHTTP, this processor makes use of ETags[1] to prevent
> >>> >> downloading
> >>> >> the same resource repeatedly.  I would speculate that this is the
> case
> >>> >> for
> >>> >> the resource you are specifying.
> >>> >>
> >>> >> As for ListHDFS, could you specify what version you are using?
> There
> >>> >> have
> >>> >> been some bugs concerning how this was handled.  If the version is
> the
> >>> >> latest, could you please provide some more details in terms of
> >>> >> structure and
> >>> >> timestamps of the associated files causing the issue you are
> >>> >> describing?
> >>> >>
> >>> >>
> >>> >> [1] https://en.wikipedia.org/wiki/HTTP_ETag
> >>> >>
> >>> >> On Tue, May 23, 2017 at 3:22 AM, prabhu Mahendran
> >>> >> <prabhuu161...@gmail.com> wrote:
> >>> >>>
> >>> >>> Since i have faced some unexpected behaviour's in NiFi.
> >>> >>>
> >>> >>> I don't know why those processors which doesn't run more than once.
> >>> >>>
> >>> >>>
> >>> >>> For example:
> >>> >>>
> >>> >>> 1.GetHTTP:
> >>> >>>
> >>> >>> I have used GetHTTP processor for download files from "HTTP" Url.
> >>> >>> Initially i have scheduled 0 sec
> >>> >>>
> >>> >>> If i runs the processor it runs only once and not again run.Once
> copy
> >>> >>> the
> >>> >>> same processor and paste in the UI then click run that processor it
> >>> >>> again
> >>> >>> runs only once.
> >>> >>>
> >>> >>> If i scheduling it then also not runs more than once.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> 2.ListHDFS:
> >>> >>>
> >>> >>> I have configured local cluster properties in ListHDFS.
> >>> >>>
> >>> >>> i have 12 files in hdfs directory.If i runs without scheduling then
> >>> >>> it
> >>> >>> lists 12 files correctly and after scheduling it only returns 11
> >>> >>> files
> >>> >>> without 1 file and not run after first time run
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> can anyone explain the behaviour of those processsors when 1 day
> >>> >>> scheduling in TimerDriven?
> >>> >>
> >>> >>
> >>> >
> >>
> >>
> >
>


Re: some processors runs only once in NiFi

2017-05-23 Thread prabhu Mahendran
Aldrin,

Thanks for your response.

For GetHTTP, I have tried downloading different files, yet it still does
not run more than once.

ListHDFS: I am using NiFi-1.2.0 and have configured the "Hadoop
Configuration Resources", "Directory", and "RecurseSubDirectories"
attributes correctly for Hadoop-2.5.2. It runs only once and does not run
again.

*Note:* I have checked these processors on Windows.

Can you give any suggestion to solve this?

Many Thanks,


On Tue, May 23, 2017 at 7:03 PM, Aldrin Piri <aldrinp...@gmail.com> wrote:

> For GetHTTP, this processor makes use of ETags[1] to prevent downloading
> the same resource repeatedly.  I would speculate that this is the case for
> the resource you are specifying.
>
> As for ListHDFS, could you specify what version you are using?  There have
> been some bugs concerning how this was handled.  If the version is the
> latest, could you please provide some more details in terms of structure
> and timestamps of the associated files causing the issue you are describing?
>
>
> [1] https://en.wikipedia.org/wiki/HTTP_ETag
>
> On Tue, May 23, 2017 at 3:22 AM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> Since i have faced some unexpected behaviour's in NiFi.
>>
>> I don't know why those processors which doesn't run more than once.
>>
>>
>>
>> *For example:*
>> *1.GetHTTP:*
>>
>> I have used GetHTTP processor for download files from "HTTP" Url.
>> Initially i have scheduled 0 sec
>>
>> If i runs the processor it runs only once and not again run.Once copy the
>> same processor and paste in the UI then click run that processor it again
>> runs only once.
>>
>> If i scheduling it then also not runs more than once.
>>
>>
>>
>> *2.ListHDFS:*
>>
>> I have configured local cluster properties in ListHDFS.
>>
>> i have 12 files in hdfs directory.If i runs without scheduling then it
>> lists 12 files correctly and after scheduling it only returns 11 files
>> without 1 file and not run after first time run
>>
>>
>>
>>
>> can anyone explain the behaviour of those processsors when 1 day
>> scheduling in TimerDriven?
>>
>
>


some processors runs only once in NiFi

2017-05-23 Thread prabhu Mahendran
I have run into some unexpected behaviour in NiFi.

I don't understand why the following processors do not run more than once.

*For example:*
*1.GetHTTP:*

I use a GetHTTP processor to download files from an HTTP URL. Initially I
scheduled it at 0 sec.

If I run the processor, it runs only once and does not run again. If I copy
the same processor, paste it in the UI, and start it, it again runs only
once.

If I schedule it, it still does not run more than once.

*2.ListHDFS:*

I have configured the local cluster properties in ListHDFS.

There are 12 files in the HDFS directory. If I run it without scheduling,
it lists all 12 files correctly; after scheduling, it returns only 11 files
(one file is missing) and does not run again after the first run.

Can anyone explain the behaviour of these processors with a 1-day
Timer-Driven schedule?


Re: How to process files sequentially?

2017-05-19 Thread prabhu Mahendran
Sorry Koji,

I am using NiFi on Windows, so I am not able to use ExecuteStreamCommand
for this.

It shows the error "The system cannot find the path specified".

Is there any other way to do this on Windows?

On Fri, May 19, 2017 at 12:50 PM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> I just used MergeContent to confirm test result. In your case, I
> thought the goal is sending queries to SQL Server in order so I think
> you don't have to use MergeContent.
>
> Having said that, I came up with an idea. This is kind of a hack but
> using ExecuteStreamingCommand to analyze number of files in a dir and
> use MergeContent 'Defragment' mode, I was able to merge content
> without using static minimum number of files.
>
> Here is another template for that example:
> https://gist.githubusercontent.com/ijokarumawak/
> 7e6158460cfcb0b5911acefbb455edf0/raw/967a051177878a98f5ddb57653478c
> 6091a7b23c/process-files-in-order-and-defrag.xml
>
> Thanks,
> Koji
>
> On Fri, May 19, 2017 at 6:11 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Hi Koji,
> >
> > Thanks for your mail.
> >
> > In your template i have one query regarding if number of files taken by
> get
> > file is unknown then MergeContent Processor could not work right? because
> > you have specify maximum number of bins to be 5.
> >
> > But in my case i am having dynamic number of file counts.In that case
> merge
> > content will failed to merge.
> >
> > Please stop me if i'm understand anything wrong.
> >
> > How to give dynamic number of entries/bin for MergeContent due to
> currently
> > there is no expression language supported?
> >
> > On Fri, May 19, 2017 at 10:34 AM, Koji Kawamura <ijokaruma...@gmail.com>
> > wrote:
> >>
> >> Hi Prabhu,
> >>
> >> I think you can use EnforceOrder processor which is available since
> >> 1.2.0, without Wait/Notify processor.
> >>
> >> Here is a sample flow I tested how it can be used for use-cases like
> >> yours:
> >> https://gist.github.com/ijokarumawak/7e6158460cfcb0b5911acefbb455edf0
> >>
> >> Thanks,
> >> Koji
> >>
> >> On Fri, May 19, 2017 at 1:52 PM, prabhu Mahendran
> >> <prabhuu161...@gmail.com> wrote:
> >> > I have approximately 1000 files in local drive.I need to move that
> files
> >> > into SQL Server accordingly one after another.
> >> >
> >> > Since local drive having files like file1.csv,file2.csv,..upto
> >> > file1000.csv.I am sure that number of files in local drive may change
> >> > dynamically.
> >> >
> >> > I can able to created template for move that files into SQL
> Server.But i
> >> > have to process the file2 when file 1 has been completely moved into
> SQL
> >> > Server.
> >> >
> >> > Is this possible in NiFi without using Wait\Notify processor?
> >> >
> >> > can anyone please guide me to solve this?
> >
> >
>


Re: How to process files sequentially?

2017-05-19 Thread prabhu Mahendran
Hi Koji,

Thanks for your mail.

Regarding your template, I have one question: if the number of files picked
up by GetFile is unknown, the MergeContent processor will not work, right?
Because you have specified a maximum of 5 bins.

In my case the number of files is dynamic, so MergeContent would fail to
merge.

Please correct me if I have misunderstood anything.

How can I give MergeContent a dynamic number of entries per bin, given that
the property currently does not support the Expression Language?

On Fri, May 19, 2017 at 10:34 AM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> I think you can use EnforceOrder processor which is available since
> 1.2.0, without Wait/Notify processor.
>
> Here is a sample flow I tested how it can be used for use-cases like yours:
> https://gist.github.com/ijokarumawak/7e6158460cfcb0b5911acefbb455edf0
>
> Thanks,
> Koji
>
> On Fri, May 19, 2017 at 1:52 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > I have approximately 1000 files in local drive.I need to move that files
> > into SQL Server accordingly one after another.
> >
> > Since local drive having files like file1.csv,file2.csv,..upto
> > file1000.csv.I am sure that number of files in local drive may change
> > dynamically.
> >
> > I can able to created template for move that files into SQL Server.But i
> > have to process the file2 when file 1 has been completely moved into SQL
> > Server.
> >
> > Is this possible in NiFi without using Wait\Notify processor?
> >
> > can anyone please guide me to solve this?
>


Re: How to lock getfile upto putfile write into same file?

2017-05-18 Thread prabhu Mahendran
I have multiple instances of PutFile writing to the same file, which fails
in strange ways.

I have tried the MergeContent processor, but it does not yield the same
results every time.

Sometimes it correctly merges files according to the filename, but at other
times it does not merge them.

In my use case I have to store the content in 10 files (the number may
change). A filename is specified in every flow file.

Sometimes I cannot merge the files that share the same filename. Also, I
cannot use the Expression Language in the "Minimum Number of Entries"
property; if I set that property to a literal value, it works.



On Thu, May 18, 2017 at 2:09 PM, Joe Witt <joe.w...@gmail.com> wrote:

> PutFile while writing automatically writes with a dot prepended and
> once the write is complete it removes the dot.
>
> If you have multiple instances of PutFile writing to the same file
> with the intent of appending to that file it will like fail in strange
> ways.
>
> I would recommend you simply build up the complete thing you want to
> write out to a file using MergeContent and send the merged results to
> PutFile.  However, it should not be the case that you have PutFile and
> GetFile sharing a given file in the same NiFi as you can simply
> connect the processors to move the data without have to leave the
> confines of NiFi using file IO.
>
> Thanks
>
> On Thu, May 18, 2017 at 3:06 AM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Ok. I will append the '.' before Putfile using UpdateAttribute. Is this
> > name(without '.') will automatically changed by Putfile when its done?
> >
> >
> >
> > Since I am appending each rows from html content into proper csv using
> > Putfile processor. Is this works fine? How putfile knows complete html
> > flowfiles has been moved.
> >
> >
> > On Tue, May 16, 2017 at 7:53 PM, Juan Sequeiros <helloj...@gmail.com>
> wrote:
> >>
> >> Prabhu,
> >>
> >> PutFile should pre-pend files with a "." while it is writing to it and
> >> then in the end will copy the "." file to its "filename" value.
> >>
> >> On GetFile side make sure that "ignore hidden files" is set to true.
> >>
> >>
> >>
> >> On Tue, May 16, 2017 at 1:29 AM prabhu Mahendran <
> prabhuu161...@gmail.com>
> >> wrote:
> >>>
> >>> I have scheduled getfile processor to 0 sec to track the local folder.
> >>>
> >>> Issue I have faced: PutFile is appending few flowfiles into single
> file.
> >>> Getfile has been configured with keepsourcefile as false. So getfile is
> >>> fetching partial content before putfile writes into local location.
> >>>
> >>> How to handle this issue? Can we lock the file till putfile/custom
> >>> processor completely writes the file and remove lock once completed??
> >
> >
>


How to process files sequentially?

2017-05-18 Thread prabhu Mahendran
I have approximately 1000 files on a local drive and need to move them into
SQL Server one after another, in order.

The local drive has files like file1.csv, file2.csv, ... up to
file1000.csv, and the number of files may change dynamically.

I was able to create a template to move the files into SQL Server, but
file2 must only be processed once file1 has been completely moved into SQL
Server.

Is this possible in NiFi without using the Wait\Notify processors?

Can anyone please guide me on this?


Re: How to lock getfile upto putfile write into same file?

2017-05-18 Thread prabhu Mahendran
OK. I will prepend the '.' before PutFile using UpdateAttribute. Will the
name (without the '.') be restored automatically by PutFile when it is
done?

Since I am appending each row from the HTML content into the proper CSV
using the PutFile processor, will this work? How does PutFile know that all
of the HTML flow files have been moved?
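
As a sketch of that UpdateAttribute step, the filename property could be
set to the following (assuming the filename attribute already holds the
target name):

${filename:prepend('.')}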

On Tue, May 16, 2017 at 7:53 PM, Juan Sequeiros <helloj...@gmail.com> wrote:

> Prabhu,
>
> PutFile should pre-pend files with a "." while it is writing to it and
> then in the end will copy the "." file to its "filename" value.
>
> On GetFile side make sure that "ignore hidden files" is set to true.
>
>
>
> On Tue, May 16, 2017 at 1:29 AM prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
>> I have scheduled getfile processor to 0 sec to track the local folder.
>>
>> Issue I have faced: PutFile is appending few flowfiles into single file.
>> Getfile has been configured with keepsourcefile as false. So getfile is
>> fetching partial content before putfile writes into local location.
>>
>> How to handle this issue? Can we lock the file till putfile/custom
>> processor completely writes the file and remove lock once completed??
>>
>


How to lock getfile upto putfile write into same file?

2017-05-15 Thread prabhu Mahendran
I have scheduled a GetFile processor at 0 sec to watch a local folder.

The issue I face: PutFile appends several flow files into a single file,
and GetFile is configured with Keep Source File set to false, so GetFile
fetches partial content before PutFile finishes writing to the local
location.

How can I handle this? Can we lock the file until PutFile (or a custom
processor) has completely written it, and remove the lock once it is
complete?


How to hold on fetch file upto putfile to be complete?

2017-05-15 Thread prabhu Mahendran
I am using a FetchFile processor after PutFile, with PutFile's conflict
resolution strategy set to 'append' so that similar lines are merged into a
particular file.

Overall, nearly 200 success flow files are sent to the FetchFile processor
for only 5 completely appended files. With FetchFile's delete-file option,
this leads to "file not found" for the remaining 195 success flow files.

I want to send success only once per filename, after grouping similar flow
files. Since MergeContent does not work for this logic, is any other option
available in NiFi?


How to use wait\notify Processor?

2017-05-11 Thread prabhu Mahendran

I am running a NiFi 1.2.0 instance.

I just tried to use the Wait/Notify processors, following this reference:
http://ijokarumawak.github.io/nifi/2017/02/02/nifi-notify-batch/#why-mergecontent-dont-suffice

I dragged the template onto the canvas and tried to run it.

The canvas shows the following error on the *"Wait/Notify"* processor:

Unable to communicate with cache when processing
StandardFlowFileRecord[uuid=faab337f-034c-4137-a2f3-abb46f22b474,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1494485406343-1, container=default,
section=1], offset=0,
length=7005603],offset=5280310,name=input.csv,size=1054261] due to
java.net.ConnectException: Connection refused: connect:

I don't know what this error means.

I have not applied any patches to 1.2.0; I just downloaded the binary and
tried it.

Please correct me if I have missed anything.

Can anyone guide me in solving this?


Re: How to get ftp file according to Current date?

2017-04-24 Thread prabhu Mahendran
Hello Pierre,

Thanks for your suggestions.

I have tried one more approach for my requirement.

I used a *GenerateFlowFile --> FetchFTP* flow to fetch the file with the
current date, using the *Remote File* property (which supports the
Expression Language).

The FetchFTP processor gets data from the FTP server but routes it to the
*comms.failure* relationship instead of *success*.

For that issue I found a JIRA ticket:

https://issues.apache.org/jira/browse/NIFI-3553

I applied the patches available in that task to NiFi, but it leads to an
IOException and still routes the whole file to the *comms.failure*
relationship.

ERROR [Timer-Driven Process Thread-4] o.a.nifi.processors.standard.FetchFTP
FetchFTP[id=9ff253e1-015b-1000-4af3-9701cd126b69] Failed to fetch content
for
StandardFlowFileRecord[uuid=1a685b21-b431-4786-a171-1b3772e576d0,claim=,offset=0,name=520303376677591,size=0]
from filename /FTPFolder/20170425InputFile.csv on remote host
ftp.host.domain:21 due to
org.apache.nifi.processor.exception.ProcessException: IOException thrown
from FetchFTP[id=9ff253e1-015b-1000-4af3-9701cd126b69]:
java.net.SocketTimeoutException: Read timed out; routing to comms.failure:
org.apache.nifi.processor.exception.ProcessException: IOException thrown
from FetchFTP[id=9ff253e1-015b-1000-4af3-9701cd126b69]:
java.net.SocketTimeoutException: Read timed out
2017-04-25 09:57:13,812 ERROR [Timer-Driven Process Thread-4]
o.a.nifi.processors.standard.FetchFTP
org.apache.nifi.processor.exception.ProcessException: IOException thrown
from FetchFTP[id=9ff253e1-015b-1000-4af3-9701cd126b69]:
java.net.SocketTimeoutException: Read timed out
at
org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2348)
~[na:na]
at
org.apache.nifi.processors.standard.FetchFileTransfer.onTrigger(FetchFileTransfer.java:238)
~[nifi-standard-processors-1.1.1.jar:1.1.1]
at
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
[nifi-api-1.1.1.jar:1.1.1]
at
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099)
[nifi-framework-core-1.1.1.jar:1.1.1]
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
[nifi-framework-core-1.1.1.jar:1.1.1]
at
org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
[nifi-framework-core-1.1.1.jar:1.1.1]
at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
[nifi-framework-core-1.1.1.jar:1.1.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[na:1.8.0_91]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
[na:1.8.0_91]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
[na:1.8.0_91]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
[na:1.8.0_91]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_91]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_91]
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
~[na:1.8.0_91]
at java.net.SocketInputStream.read(SocketInputStream.java:170)
~[na:1.8.0_91]
at java.net.SocketInputStream.read(SocketInputStream.java:141)
~[na:1.8.0_91]

I don't know why the processor shows this error.

Can you help me with this?

Thanks,
Mahendran

On Mon, Apr 24, 2017 at 9:32 PM, Pierre Villard <pierre.villard...@gmail.com
> wrote:

> Hi,
>
> You can file a JIRA and ask for an improvement to add this feature to
> GetFTP and ListFTP, or modify the processor on your side and re-generate
> the NAR containing the processor. This should not be a huge change.
>
> 2017-04-24 9:11 GMT+02:00 prabhu Mahendran <prabhuu161...@gmail.com>:
>
>> Pierre,
>>
>> Thanks for your reply.
>>
>> I have tried your suggestions.
>>
>> i am having 1 lakhs files in FTP Directory.
>>
>> For *ListFTP* It check all filenames and process into queue it could
>> take long time.
>>
>> So it can take more time for fetch particular FTP file from lakhs.
>>
>> Can you guide way to reduce time taken for fetch file?
>>
>> On Fri, Apr 21, 2017 at 8:40 PM, Pierre Villard <
>> pierre.villard...@gmail.com> wrote:
>>
>>> You could use the combination of ListFTP and FetchFTP (this is, most of
>>> the time, a better approach), and between the two processors you could do a
>>> Rout

Re: How to get ftp file according to Current date?

2017-04-24 Thread prabhu Mahendran
Pierre,

Thanks for your reply.

I have tried your suggestions.

I have 1 lakh files in the FTP directory.

*ListFTP* checks all the filenames and puts them into the queue, which can
take a long time.

So fetching a particular FTP file out of lakhs of files can take a long
time.

Can you suggest a way to reduce the time taken to fetch the file?

On Fri, Apr 21, 2017 at 8:40 PM, Pierre Villard <pierre.villard...@gmail.com
> wrote:

> You could use the combination of ListFTP and FetchFTP (this is, most of
> the time, a better approach), and between the two processors you could do a
> RouteOnAttribute and only keep the flow files with the filename you are
> looking for.
>
> Pierre.
>
> 2017-04-21 13:29 GMT+02:00 prabhu Mahendran <prabhuu161...@gmail.com>:
>
>>  I have tried that "GetFTP" processor in which downloads file from FTP
>> accoding to the two attributes
>>
>> 1."FileFilterRegex" -Name of file in FTP
>> 2."RemotePath"-Path of an FTP file.
>>
>> I wants to download the File from FTP Server only if it having today's
>> date which is append with filename.
>>
>> *For example:*
>>
>> File name is *20170421TempFile.txt *
>> which is in FTP Server.
>>
>> Now i need to give that system date only to be append with filename.It
>> should get the current system date automatically *instead of given date
>> value directly*.
>>
>> So i have find that ${now()} gets the current system date but i cannot
>> give it in *"FileFilterRegex"* attribute due to it doesn't have
>> expression language support.
>>
>> Finally i need to get particular file with current date.
>>
>>  Anyone give some idea/guide me to achieve my requirements?
>>
>
>


How to get ftp file according to Current date?

2017-04-21 Thread prabhu Mahendran
I have tried the "GetFTP" processor, which downloads files from FTP
according to two attributes:

1. "FileFilterRegex" - the name of the file on the FTP server
2. "RemotePath" - the path of the FTP file

I want to download a file from the FTP server only if it has today's date
prepended to its filename.

*For example:*

The file name on the FTP server is *20170421TempFile.txt*.

Now I need the current system date to be applied to the filename
automatically, *instead of giving the date value directly*.

I found that ${now()} gets the current system date, but I cannot use it in
the *"FileFilterRegex"* attribute because it does not support the
Expression Language.

Finally, I need to get the particular file with the current date.

Can anyone give me some ideas or guide me to achieve this?
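
As a sketch of the ListFTP/FetchFTP plus RouteOnAttribute approach
suggested in the replies above, the routing property could compare each
listed filename against today's date (a hedged example, not from this
thread):

${filename:startsWith(${now():format('yyyyMMdd')})}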


Re: How can datetime to month conversion failed in french language?

2017-04-12 Thread prabhu Mahendran
Jeff,

Yes, my data contains English month names.

Andre,

It would be useful to be able to specify a locale argument in the
Expression Language functions.

After the discussion, will you create a task for it?


On Thu, Apr 13, 2017 at 2:33 AM, Andre <andre-li...@fucs.org> wrote:

> Hi,
>
> I suspect that at the moment the conversion between locales needs to be
> done manualy.
>
> It may be worth discussing with the community the ability to specify the
> locale argument when calling the Expression Language functions you referred
> to.
>
> Meanwhile, IF your other flows and processors allow, you may want to
> specify the locale NiFi uses to execute either via environment variables or
> using java command line parameters.
>
> Cheers
>
> On 12 Apr 2017 8:03 PM, "prabhu Mahendran" <prabhuu161...@gmail.com>
> wrote:
>
> output of the breakdown of the functions is 'Mai'.But in my local file
> contains 'May'. while processing 'May'(English) could be converted as
> 'Mai'(French).
>
> Is there is any expression language to convert French language into
> English?
>
> On Mon, Apr 10, 2017 at 8:02 PM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> I have store that result in another attribute using updateAttribute
>> processor.
>>
>> While incoming flowfiles into updateAttribute processor i have faced that
>> error.
>> On 10-Apr-2017 6:52 PM, "Andre" <andre-li...@fucs.org> wrote:
>>
>>> Prabhu,
>>>
>>> Thanks for the breakdown of the functions but what does
>>> *${input.4:substringBefore('-'):toDate('MMM')}* output? :-)
>>>
>>> May? Mai? something else?
>>>
>>> Cheers
>>>
>>> On Mon, Apr 10, 2017 at 10:39 PM, prabhu Mahendran <
>>> prabhuu161...@gmail.com> wrote:
>>>
>>>> Andre,
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *1,12990,Mahe,May-17input.1->1input.2->12990input.3->Mahe.input.4->May-17*
>>>> *${input.4:substringBefore('-'):toDate('MMM')}*
>>>>
>>>> substringBefore('-')--> get the string portion before(' - ') symbol.It
>>>> returns result 'May'.
>>>>
>>>> toDate('MMM')-->Converts string into Month format.
>>>>
>>>> format('MM')-->It converts 'May' into Number like if Jan it should be
>>>> 01 then 'May it should be 05.
>>>>
>>>> Here i have convert month into number.
>>>>
>>>> Cheers
>>>>
>>>> On Mon, Apr 10, 2017 at 5:58 PM, Andre <andre-li...@fucs.org> wrote:
>>>>
>>>>> Prabhu,
>>>>>
>>>>> What is the output of *${input.4:substringBefore('-'):toDate('MMM')} *
>>>>> ?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Mon, Apr 10, 2017 at 3:15 PM, prabhu Mahendran <
>>>>> prabhuu161...@gmail.com> wrote:
>>>>>
>>>>>> Jeff,
>>>>>>
>>>>>> My actual data is in English(US).
>>>>>>
>>>>>> consider sample data,
>>>>>>
>>>>>>
>>>>>> *1,12990,Mahe,May-17*
>>>>>> In this line i have get "May-17" and split it as 'May' and '17'.
>>>>>>
>>>>>> Using below expression language..,
>>>>>>
>>>>>>
>>>>>>
>>>>>> *${input.4:substringBefore('-'):toDate('MMM'):format('MM')}*In above
>>>>>> query it could convert 'May' into '05' value.
>>>>>>
>>>>>> That can be work in my windows (English(US)).
>>>>>>
>>>>>> That Same query not work in French OS windows(French(Swiss)).
>>>>>>
>>>>>> It shows below error.
>>>>>>
>>>>>> *org.apache.nifi.expression.language.exception.IllegalAttributeExpression:Cannot
>>>>>> parse attribute value as date:dateformat:MMM;attribute value:Mai*
>>>>>>
>>>>>> In that exception it shows attribute value is 'Mai'.Those value is in
>>>>>> 'French' but i had given my data is in 'May' [english only].
>>>>>>
>>>>>> Can you suggest way to avoid this exception?
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 7, 2017 at 11:40 PM, Jeff <jtsw...@gmail.com> wrote:
>>>>>>

Re: MaxFileSize in timeBasedFileNamingAndTriggeringPolicy not work?

2017-04-12 Thread prabhu Mahendran
For your information, I have tried the same procedure in NiFi 1.1.1 as well.

That release also ships only the logback-classic-1.1.3 and logback-core-1.1.3
jars.

In conf\logback.xml, MaxFileSize and maxHistory inside the *TimeBasedRollingPolicy*
still do not take effect.



On Wed, Apr 12, 2017 at 3:58 PM, prabhu Mahendran <prabhuu161...@gmail.com>
wrote:

> Jeff,
>
> Thanks for your mail.
>
> I need to use log back 1.1.7 in NiFi-0.6.1 .
>
> Can you suggest any way to change 1.1.3 into 1.1.7?
>
> On Wed, Apr 12, 2017 at 9:42 AM, Jeff <jtsw...@gmail.com> wrote:
>
>> Hello Prabhu,
>>
>> I think you're running into a logback bug [1] that is fixed with 1.1.7.
>> Unfortunately, it looks like NiFi 0.6.1 is using logback 1.1.3.
>>
>> [1] https://jira.qos.ch/browse/LOGBACK-747
>>
>> On Tue, Apr 11, 2017 at 8:37 AM prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>>> In NiFi-0.6.1 ,i have try to reduce size of nifi-app.log to be stored in
>>> local directory.
>>>
>>> In that conf\logback.xml i have configured "MaxFileSize" to be 1MB.I
>>> think this only stores nifi-app.log should be under 1 MB Size only.But it
>>> doesn't do like that.It always store every logs.
>>>
>>> >> class="ch.qos.logback.core.rolling.RollingFileAppender">
>>> logs/nifi-app.log
>>> >> class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
>>> 
>>> 
>>> ./logs/nifi-app_%d{-MM-dd_HH}.%i.log
>>> >> class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
>>> 1MB
>>> 
>>> 
>>> 1
>>> 
>>> >> class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
>>> %date %level [%thread] %logger{40} %msg%n
>>> true
>>> 
>>> 
>>>
>>> Now i need to set 1MB for size of an nifi-app.log.
>>>
>>> *How to set size for nifi-app.log?*
>>>
>>
>


Re: MaxFileSize in timeBasedFileNamingAndTriggeringPolicy not work?

2017-04-12 Thread prabhu Mahendran
Jeff,

Thanks for your mail.

I need to use logback 1.1.7 with NiFi 0.6.1.

Can you suggest a way to upgrade it from 1.1.3 to 1.1.7?
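Is it just a matter of stopping NiFi and replacing the two jars under lib, for
example

    %NIFI_HOME%\lib\logback-classic-1.1.3.jar  ->  logback-classic-1.1.7.jar
    %NIFI_HOME%\lib\logback-core-1.1.3.jar     ->  logback-core-1.1.7.jar

and then restarting? I have not tried this yet, so the jar swap is only a guess
on my part.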

On Wed, Apr 12, 2017 at 9:42 AM, Jeff <jtsw...@gmail.com> wrote:

> Hello Prabhu,
>
> I think you're running into a logback bug [1] that is fixed with 1.1.7.
> Unfortunately, it looks like NiFi 0.6.1 is using logback 1.1.3.
>
> [1] https://jira.qos.ch/browse/LOGBACK-747
>
> On Tue, Apr 11, 2017 at 8:37 AM prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
>> In NiFi-0.6.1 ,i have try to reduce size of nifi-app.log to be stored in
>> local directory.
>>
>> In that conf\logback.xml i have configured "MaxFileSize" to be 1MB.I
>> think this only stores nifi-app.log should be under 1 MB Size only.But it
>> doesn't do like that.It always store every logs.
>>
>> <appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
>>     <file>logs/nifi-app.log</file>
>>     <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
>>         <fileNamePattern>./logs/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
>>         <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
>>             <maxFileSize>1MB</maxFileSize>
>>         </timeBasedFileNamingAndTriggeringPolicy>
>>         <maxHistory>1</maxHistory>
>>     </rollingPolicy>
>>     <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
>>         <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
>>         <immediateFlush>true</immediateFlush>
>>     </encoder>
>> </appender>
>>
>> Now i need to set 1MB for size of an nifi-app.log.
>>
>> *How to set size for nifi-app.log?*
>>
>


Re: How can datetime to month conversion failed in french language?

2017-04-12 Thread prabhu Mahendran
The output of the breakdown of the functions is 'Mai'. But my local file
contains 'May'; during processing, 'May' (English) is being converted to
'Mai' (French).

Is there any Expression Language function to convert the French month name back
into English?

On Mon, Apr 10, 2017 at 8:02 PM, prabhu Mahendran <prabhuu161...@gmail.com>
wrote:

> I have store that result in another attribute using updateAttribute
> processor.
>
> While incoming flowfiles into updateAttribute processor i have faced that
> error.
> On 10-Apr-2017 6:52 PM, "Andre" <andre-li...@fucs.org> wrote:
>
>> Prabhu,
>>
>> Thanks for the breakdown of the functions but what does
>> *${input.4:substringBefore('-'):toDate('MMM')}* output? :-)
>>
>> May? Mai? something else?
>>
>> Cheers
>>
>> On Mon, Apr 10, 2017 at 10:39 PM, prabhu Mahendran <
>> prabhuu161...@gmail.com> wrote:
>>
>>> Andre,
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *1,12990,Mahe,May-17input.1->1input.2->12990input.3->Mahe.input.4->May-17*
>>> *${input.4:substringBefore('-'):toDate('MMM')}*
>>>
>>> substringBefore('-')--> get the string portion before(' - ') symbol.It
>>> returns result 'May'.
>>>
>>> toDate('MMM')-->Converts string into Month format.
>>>
>>> format('MM')-->It converts 'May' into Number like if Jan it should be 01
>>> then 'May it should be 05.
>>>
>>> Here i have convert month into number.
>>>
>>> Cheers
>>>
>>> On Mon, Apr 10, 2017 at 5:58 PM, Andre <andre-li...@fucs.org> wrote:
>>>
>>>> Prabhu,
>>>>
>>>> What is the output of *${input.4:substringBefore('-'):toDate('MMM')} *?
>>>>
>>>> Cheers
>>>>
>>>> On Mon, Apr 10, 2017 at 3:15 PM, prabhu Mahendran <
>>>> prabhuu161...@gmail.com> wrote:
>>>>
>>>>> Jeff,
>>>>>
>>>>> My actual data is in English(US).
>>>>>
>>>>> consider sample data,
>>>>>
>>>>>
>>>>> *1,12990,Mahe,May-17*
>>>>> In this line i have get "May-17" and split it as 'May' and '17'.
>>>>>
>>>>> Using below expression language..,
>>>>>
>>>>>
>>>>>
>>>>> *${input.4:substringBefore('-'):toDate('MMM'):format('MM')}*In above
>>>>> query it could convert 'May' into '05' value.
>>>>>
>>>>> That can be work in my windows (English(US)).
>>>>>
>>>>> That Same query not work in French OS windows(French(Swiss)).
>>>>>
>>>>> It shows below error.
>>>>>
>>>>> *org.apache.nifi.expression.language.exception.IllegalAttributeExpression:Cannot
>>>>> parse attribute value as date:dateformat:MMM;attribute value:Mai*
>>>>>
>>>>> In that exception it shows attribute value is 'Mai'.Those value is in
>>>>> 'French' but i had given my data is in 'May' [english only].
>>>>>
>>>>> Can you suggest way to avoid this exception?
>>>>>
>>>>>
>>>>> On Fri, Apr 7, 2017 at 11:40 PM, Jeff <jtsw...@gmail.com> wrote:
>>>>>
>>>>>> Prabhu,
>>>>>>
>>>>>> I'll have to try this in NiFi myself.  I'll let you know what I
>>>>>> find.  What is the result of the EL you're using when you are trying it
>>>>>> with French?
>>>>>>
>>>>>> On Fri, Apr 7, 2017 at 1:03 AM prabhu Mahendran <
>>>>>> prabhuu161...@gmail.com> wrote:
>>>>>>
>>>>>>> jeff,
>>>>>>>
>>>>>>> Thanks for your reply.
>>>>>>>
>>>>>>> Attribute 'ds' having the '07/04/2017'.
>>>>>>>
>>>>>>> And  convert that into month using UpdateAttribute.
>>>>>>>
>>>>>>> ${ds:toDate('dd/MM/'):format('MMM')}.
>>>>>>>
>>>>>>> if i use that code in windows having language English(India) then it
>>>>>>> worked.
>>>>>>>
>>>>>>> If i use that code in windows having language French(OS) it couldn't
>>>>>>> work.
>>>>>>>
>>>>>>> Can you suggest any way to solve that problem?
>>>>>>>
>>>>>>> On Fri, Apr 7, 2017 at 1:28 AM, Jeff <jtsw...@gmail.com> wrote:
>>>>>>>
>>>>>>> What is the expression language statement that you're attempting to
>>>>>>> use?
>>>>>>>
>>>>>>> On Thu, Apr 6, 2017 at 3:12 AM prabhu Mahendran <
>>>>>>> prabhuu161...@gmail.com> wrote:
>>>>>>>
>>>>>>> In NiFi How JVM Check language of machine?
>>>>>>>
>>>>>>> is that take any default language like English(US) else System
>>>>>>> DateTime Selected language?
>>>>>>>
>>>>>>> I face issue while converting datetime format into Month using
>>>>>>> expression language with NiFi package installed with French OS.
>>>>>>>
>>>>>>> But it worked in English(US) Selected language.
>>>>>>>
>>>>>>> Can anyone help me to resolve this?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>


MaxFileSize in timeBasedFileNamingAndTriggeringPolicy not work?

2017-04-11 Thread prabhu Mahendran
In NiFi 0.6.1, I am trying to limit the size of the nifi-app.log stored in the
local logs directory.

In conf\logback.xml I have configured "MaxFileSize" to be 1MB. I expected this
to keep nifi-app.log under 1 MB, but it doesn't; the file keeps growing and
every log entry is retained.


<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/nifi-app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>./logs/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
        <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
            <maxFileSize>1MB</maxFileSize>
        </timeBasedFileNamingAndTriggeringPolicy>
        <maxHistory>1</maxHistory>
    </rollingPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
        <immediateFlush>true</immediateFlush>
    </encoder>
</appender>



Now I need to cap nifi-app.log at 1MB.

*How do I set a size limit for nifi-app.log?*
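Would a size-only rolling policy be an acceptable alternative? Something like
the following (an untested sketch: it drops the time-based rollover and keeps at
most five 1 MB files):

<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/nifi-app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
        <fileNamePattern>./logs/nifi-app.%i.log</fileNamePattern>
        <minIndex>1</minIndex>
        <maxIndex>5</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
        <maxFileSize>1MB</maxFileSize>
    </triggeringPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
        <immediateFlush>true</immediateFlush>
    </encoder>
</appender>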


Re: How can datetime to month conversion failed in french language?

2017-04-10 Thread prabhu Mahendran
Andre,







*1,12990,Mahe,May-17*
*input.1 -> 1*
*input.2 -> 12990*
*input.3 -> Mahe*
*input.4 -> May-17*

*${input.4:substringBefore('-'):toDate('MMM')}*

substringBefore('-') --> gets the portion of the string before the '-' symbol;
it returns 'May'.

toDate('MMM') --> converts that string into a date using the month name.

format('MM') --> converts the month into a number, e.g. 'Jan' becomes 01 and
'May' becomes 05.

So here I am converting the month name into a month number.

Cheers

On Mon, Apr 10, 2017 at 5:58 PM, Andre <andre-li...@fucs.org> wrote:

> Prabhu,
>
> What is the output of *${input.4:substringBefore('-'):toDate('MMM')} *?
>
> Cheers
>
> On Mon, Apr 10, 2017 at 3:15 PM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> Jeff,
>>
>> My actual data is in English(US).
>>
>> consider sample data,
>>
>>
>> *1,12990,Mahe,May-17*
>> In this line i have get "May-17" and split it as 'May' and '17'.
>>
>> Using below expression language..,
>>
>>
>>
>> *${input.4:substringBefore('-'):toDate('MMM'):format('MM')}*In above
>> query it could convert 'May' into '05' value.
>>
>> That can be work in my windows (English(US)).
>>
>> That Same query not work in French OS windows(French(Swiss)).
>>
>> It shows below error.
>>
>> *org.apache.nifi.expression.language.exception.IllegalAttributeExpression:Cannot
>> parse attribute value as date:dateformat:MMM;attribute value:Mai*
>>
>> In that exception it shows attribute value is 'Mai'.Those value is in
>> 'French' but i had given my data is in 'May' [english only].
>>
>> Can you suggest way to avoid this exception?
>>
>>
>> On Fri, Apr 7, 2017 at 11:40 PM, Jeff <jtsw...@gmail.com> wrote:
>>
>>> Prabhu,
>>>
>>> I'll have to try this in NiFi myself.  I'll let you know what I find.
>>> What is the result of the EL you're using when you are trying it with
>>> French?
>>>
>>> On Fri, Apr 7, 2017 at 1:03 AM prabhu Mahendran <prabhuu161...@gmail.com>
>>> wrote:
>>>
>>>> jeff,
>>>>
>>>> Thanks for your reply.
>>>>
>>>> Attribute 'ds' having the '07/04/2017'.
>>>>
>>>> And  convert that into month using UpdateAttribute.
>>>>
>>>> ${ds:toDate('dd/MM/yyyy'):format('MMM')}.
>>>>
>>>> if i use that code in windows having language English(India) then it
>>>> worked.
>>>>
>>>> If i use that code in windows having language French(OS) it couldn't
>>>> work.
>>>>
>>>> Can you suggest any way to solve that problem?
>>>>
>>>> On Fri, Apr 7, 2017 at 1:28 AM, Jeff <jtsw...@gmail.com> wrote:
>>>>
>>>> What is the expression language statement that you're attempting to use?
>>>>
>>>> On Thu, Apr 6, 2017 at 3:12 AM prabhu Mahendran <
>>>> prabhuu161...@gmail.com> wrote:
>>>>
>>>> In NiFi How JVM Check language of machine?
>>>>
>>>> is that take any default language like English(US) else System DateTime
>>>> Selected language?
>>>>
>>>> I face issue while converting datetime format into Month using
>>>> expression language with NiFi package installed with French OS.
>>>>
>>>> But it worked in English(US) Selected language.
>>>>
>>>> Can anyone help me to resolve this?
>>>>
>>>>
>>>>
>>
>


Re: How can datetime to month conversion failed in french language?

2017-04-09 Thread prabhu Mahendran
Jeff,

My actual data is in English (US).

Consider this sample data:

*1,12990,Mahe,May-17*

From this line I take "May-17" and split it into 'May' and '17'.

Using the expression language below:

*${input.4:substringBefore('-'):toDate('MMM'):format('MM')}*

the query converts 'May' into the value '05'.

That works on my Windows machine with the language set to English (US).

The same query does not work on a French Windows OS (French (Swiss)).

It shows below error.

*org.apache.nifi.expression.language.exception.IllegalAttributeExpression:Cannot
parse attribute value as date:dateformat:MMM;attribute value:Mai*

The exception shows the attribute value as 'Mai', which is French, even though
the data I supplied contains 'May' (English only).

Can you suggest a way to avoid this exception?


On Fri, Apr 7, 2017 at 11:40 PM, Jeff <jtsw...@gmail.com> wrote:

> Prabhu,
>
> I'll have to try this in NiFi myself.  I'll let you know what I find.
> What is the result of the EL you're using when you are trying it with
> French?
>
> On Fri, Apr 7, 2017 at 1:03 AM prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
>> jeff,
>>
>> Thanks for your reply.
>>
>> Attribute 'ds' having the '07/04/2017'.
>>
>> And  convert that into month using UpdateAttribute.
>>
>> ${ds:toDate('dd/MM/'):format('MMM')}.
>>
>> if i use that code in windows having language English(India) then it
>> worked.
>>
>> If i use that code in windows having language French(OS) it couldn't work.
>>
>> Can you suggest any way to solve that problem?
>>
>> On Fri, Apr 7, 2017 at 1:28 AM, Jeff <jtsw...@gmail.com> wrote:
>>
>> What is the expression language statement that you're attempting to use?
>>
>> On Thu, Apr 6, 2017 at 3:12 AM prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>> In NiFi How JVM Check language of machine?
>>
>> is that take any default language like English(US) else System DateTime
>> Selected language?
>>
>> I face issue while converting datetime format into Month using expression
>> language with NiFi package installed with French OS.
>>
>> But it worked in English(US) Selected language.
>>
>> Can anyone help me to resolve this?
>>
>>
>>


Re: How can datetime to month conversion failed in french language?

2017-04-06 Thread prabhu Mahendran
jeff,

Thanks for your reply.

The attribute 'ds' holds the value '07/04/2017'.

I convert that into a month name using UpdateAttribute:

${ds:toDate('dd/MM/yyyy'):format('MMM')}

If I use that expression on Windows with the language set to English (India), it
works.

If I use it on Windows with a French OS language, it does not.

Can you suggest a way to solve this problem?

On Fri, Apr 7, 2017 at 1:28 AM, Jeff <jtsw...@gmail.com> wrote:

> What is the expression language statement that you're attempting to use?
>
> On Thu, Apr 6, 2017 at 3:12 AM prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
>> In NiFi How JVM Check language of machine?
>>
>> is that take any default language like English(US) else System DateTime
>> Selected language?
>>
>> I face issue while converting datetime format into Month using expression
>> language with NiFi package installed with French OS.
>>
>> But it worked in English(US) Selected language.
>>
>> Can anyone help me to resolve this?
>>
>


How can datetime to month conversion failed in french language?

2017-04-06 Thread prabhu Mahendran
How does the JVM used by NiFi determine the machine's language?

Does it assume a default locale such as English (US), or does it take the
language selected in the system date/time settings?

I am facing an issue when converting a datetime value into a month name using
the Expression Language, with NiFi installed on a French-language OS.

The same conversion works when the selected language is English (US).

Can anyone help me to resolve this?


Is this possible to store the conf/archive directory in HDFS or FTP?

2017-03-24 Thread prabhu Mahendran
I am trying to store NiFi's flow backups in HDFS or an FTP directory.

Whenever the flow changes, the configuration is archived under
"%NIFI_HOME%/conf/archive" and written to "%NIFI_HOME%/conf/flow.xml.gz".

Now I want those changes to be stored in an FTP directory as well, so that any
configuration change is also copied there.

I tried configuring the archive locations as:

nifi.flow.configuration.file=ftp:/conf/flow.xml.gz
nifi.flow.configuration.archive.dir=ftp:/conf/archive/

But that does not work; it leads to an InvalidPathException and NiFi will not
start.

Is there any way to store the flow archive on both the local drive and an FTP
directory?
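One idea I am considering, in case those properties must stay local: leave
nifi.flow.configuration.file and nifi.flow.configuration.archive.dir at their
defaults and add a small flow that ships the archives out, for example

    GetFile (Input Directory = ./conf/archive, Keep Source File = true)
        -> PutFTP (or PutHDFS)

Would that be the recommended approach, or is there a built-in way?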


Re: Cannot get a connection, pool error Timeout waiting for idle object in PutSQL?

2017-03-22 Thread prabhu Mahendran
Matt,

Thanks for your suggestion. It worked.

I have set 4GB of heap memory for NiFi and use approximately 30 concurrent
tasks across my flow.

This speeds up the flow considerably, but CPU usage goes to 100%.

If I don't use extra concurrent tasks, CPU usage stays normal.

Is there any way to reduce or avoid the high CPU usage?
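For reference, these are the settings I am planning to experiment with; please
correct me if these are the wrong knobs:

    PutSQL > Scheduling > Run Schedule                       : e.g. 100 ms instead of 0 sec
    Controller Settings > Maximum Timer Driven Thread Count  : close to the number of CPU cores
    DBCPConnectionPool > Max Total Connections               : at least the number of concurrent PutSQL tasks
    DBCPConnectionPool > Max Wait Time                       : 1 sec, as suggested earlier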

On Wed, Mar 22, 2017 at 8:05 PM, Matt Burgess <mattyb...@apache.org> wrote:

> I just noticed this answer on SO as well [1].
>
> Regards,
> Matt
>
> [1] http://stackoverflow.com/questions/42942759/cannot-get-
> a-connection-pool-error-timeout-waiting-for-idle-object-in-putsql
>
> On Wed, Mar 22, 2017 at 10:33 AM, Matt Burgess <mattyb...@apache.org>
> wrote:
> > Prabhu,
> >
> > What are your settings for the DBCPConnectionPool controller service?
> > The defaults are 8 Max Connections and 500 milliseconds for Max Wait
> > Time. For 10 concurrent PutSQL tasks, the first 8 will likely get
> > connections, and if none are returned in 500 milliseconds, then one of
> > the other tasks will not get a connection, leading to the error you
> > see above.
> >
> > I recommend setting Max Connections as high as is prudent (at least
> > the number of concurrent tasks using the controller service), and
> > perhaps extending to Max Wait Time to 1 second or more, depending on
> > how long you are willing for a task to wait for a connection to be
> > returned to the pool by some other task.
> >
> > Regards,
> > Matt
> >
> >
> > On Wed, Mar 22, 2017 at 12:43 AM, prabhu Mahendran
> > <prabhuu161...@gmail.com> wrote:
> >> I have increased the concurrent tasks to be '10' for PutSQL processor.
> >>
> >> At that time it shows below error but there is no data loss.
> >>
> >> failed to process due to
> >> org.apache.nifi.processor.exception.ProcessException:
> >> org.apache.commons.dbcp.SQLNestedException: Cannot get a connection,
> pool
> >> error Timeout waiting for idle object; rolling back session:
> >>
> >> if i have remove concurrent tasks then it worked without those exception
> >>
> >> while google this exception i have found answer in below link
> >>
> >> I am getting Cannot get a connection, pool error Timeout waiting for
> idle
> >> object, When I try to create more than 250 threads in my web application
> >>
> >> But i don't know how to avoid this issue in NiFi putSQL.
> >>
> >> Can anyone help me to resolve this?
> >>
>


Re: How to access Controller Service created in UI into Root processors in NiFi-1.1.1?

2017-03-22 Thread prabhu Mahendran
No, I am not running a secure NiFi instance. I was only viewing the URL
(http://<host>:<port>/nifi)
on Windows 8.1.

On Wed, Mar 22, 2017 at 8:43 PM, Bryan Bende <bbe...@gmail.com> wrote:

> Are you running a secure NiFi instance with certificates over https?
>
> On Wed, Mar 22, 2017 at 11:02 AM, prabhu Mahendran <
> prabhuu161...@gmail.com> wrote:
>
>> I am always use google chrome in windows to view the Nifi UI.
>>
>> And i cannot able to see the access policies in chrome also.
>> On 22-Mar-2017 5:56 PM, "Matt Gilman" <matt.c.gil...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> What browser are you using? We had some issues the action icons in
>>> Firefox in NiFi 1.1.0 wrapping to the next line that is addressed in this
>>> JIRA [1]. If you are using Firefox, can you try Chrome to help narrow down
>>> the issue? Does the policy links show up in the Controller Services tab?
>>>
>>> Thanks
>>>
>>> Matt
>>>
>>> [1] https://issues.apache.org/jira/browse/NIFI-3167
>>>
>>> On Wed, Mar 22, 2017 at 2:16 AM, prabhu Mahendran <
>>> prabhuu161...@gmail.com> wrote:
>>>
>>>> I haven't see the access policies in "Reporting Tasks".Look at below
>>>> image.
>>>>
>>>> ​
>>>> ​But in your referral link i can see access policies in controller
>>>> services tab.https://nifi.apache.org/docs/nifi-docs/html/user-guide.h
>>>> tml#Controller_Services_for_Reporting_Tasks
>>>>
>>>> On Sat, Mar 18, 2017 at 12:08 AM, Bryan Bende <bbe...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I responded on stackoverflow:
>>>>>
>>>>> https://stackoverflow.com/questions/42853055/how-to-access-c
>>>>> ontroller-service-created-in-ui-into-root-processors-in-nifi-1-1
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Bryan
>>>>>
>>>>>
>>>>> On Fri, Mar 17, 2017 at 5:09 AM, prabhu Mahendran
>>>>> <prabhuu161...@gmail.com> wrote:
>>>>> >
>>>>> > Nifi-1.1.0:
>>>>> >
>>>>> > In nifi-1.1.0 i have attached created controller services in
>>>>> (Image1.png)
>>>>> > Root canvas.
>>>>> >
>>>>> > After creating Controller services i have drag and drop the putSQL in
>>>>> > UI.(Image2.png)
>>>>> >
>>>>> > In JDBC Connection Pool,it can show created controllerservice.
>>>>> >
>>>>> >
>>>>> > But in NiFi-1.1.1 i do same procedure but ConnectionPool doesn't
>>>>> shown.
>>>>> >
>>>>> > Nifi-1.1.1:
>>>>> >
>>>>> > 1.i have created controller service attached image (Image3.png)
>>>>> >
>>>>> > 2.Drag the PutSQL in Root canvas.It couldn't show already created
>>>>> service.
>>>>> > (Image4.png)
>>>>> >
>>>>> > Note: I have attached all images in zip file.
>>>>> >
>>>>> > i don't know exactly is this correct behaviour for NiFi-1.1.1?
>>>>> >
>>>>> > How can i access the created controller service in Root into Root
>>>>> > Processors?
>>>>> >
>>>>> > can anyone explain concepts behind controller service creation in
>>>>> Nifi-1.1.1
>>>>> > root canvas?
>>>>>
>>>>
>>>>
>>>
>


Re: How to access Controller Service created in UI into Root processors in NiFi-1.1.1?

2017-03-22 Thread prabhu Mahendran
I always use Google Chrome on Windows to view the NiFi UI.

I am not able to see the access policies in Chrome either.
On 22-Mar-2017 5:56 PM, "Matt Gilman" <matt.c.gil...@gmail.com> wrote:

> Hi,
>
> What browser are you using? We had some issues the action icons in Firefox
> in NiFi 1.1.0 wrapping to the next line that is addressed in this JIRA [1].
> If you are using Firefox, can you try Chrome to help narrow down the issue?
> Does the policy links show up in the Controller Services tab?
>
> Thanks
>
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-3167
>
> On Wed, Mar 22, 2017 at 2:16 AM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> I haven't see the access policies in "Reporting Tasks".Look at below
>> image.
>>
>> ​
>> ​But in your referral link i can see access policies in controller
>> services tab.https://nifi.apache.org/docs/nifi-docs/html/user-guide.
>> html#Controller_Services_for_Reporting_Tasks
>>
>> On Sat, Mar 18, 2017 at 12:08 AM, Bryan Bende <bbe...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I responded on stackoverflow:
>>>
>>> https://stackoverflow.com/questions/42853055/how-to-access-c
>>> ontroller-service-created-in-ui-into-root-processors-in-nifi-1-1
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>>
>>> On Fri, Mar 17, 2017 at 5:09 AM, prabhu Mahendran
>>> <prabhuu161...@gmail.com> wrote:
>>> >
>>> > Nifi-1.1.0:
>>> >
>>> > In nifi-1.1.0 i have attached created controller services in
>>> (Image1.png)
>>> > Root canvas.
>>> >
>>> > After creating Controller services i have drag and drop the putSQL in
>>> > UI.(Image2.png)
>>> >
>>> > In JDBC Connection Pool,it can show created controllerservice.
>>> >
>>> >
>>> > But in NiFi-1.1.1 i do same procedure but ConnectionPool doesn't shown.
>>> >
>>> > Nifi-1.1.1:
>>> >
>>> > 1.i have created controller service attached image (Image3.png)
>>> >
>>> > 2.Drag the PutSQL in Root canvas.It couldn't show already created
>>> service.
>>> > (Image4.png)
>>> >
>>> > Note: I have attached all images in zip file.
>>> >
>>> > i don't know exactly is this correct behaviour for NiFi-1.1.1?
>>> >
>>> > How can i access the created controller service in Root into Root
>>> > Processors?
>>> >
>>> > can anyone explain concepts behind controller service creation in
>>> Nifi-1.1.1
>>> > root canvas?
>>>
>>
>>
>


Re: How to access Controller Service created in UI into Root processors in NiFi-1.1.1?

2017-03-22 Thread prabhu Mahendran
I don't see the access policies under "Reporting Tasks". See the attached image.

But in the link you referred to, I can see the access policies in the Controller
Services tab:
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Controller_Services_for_Reporting_Tasks

On Sat, Mar 18, 2017 at 12:08 AM, Bryan Bende <bbe...@gmail.com> wrote:

> Hello,
>
> I responded on stackoverflow:
>
> https://stackoverflow.com/questions/42853055/how-to-
> access-controller-service-created-in-ui-into-root-processors-in-nifi-1-1
>
> Thanks,
>
> Bryan
>
>
> On Fri, Mar 17, 2017 at 5:09 AM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> >
> > Nifi-1.1.0:
> >
> > In nifi-1.1.0 i have attached created controller services in (Image1.png)
> > Root canvas.
> >
> > After creating Controller services i have drag and drop the putSQL in
> > UI.(Image2.png)
> >
> > In JDBC Connection Pool,it can show created controllerservice.
> >
> >
> > But in NiFi-1.1.1 i do same procedure but ConnectionPool doesn't shown.
> >
> > Nifi-1.1.1:
> >
> > 1.i have created controller service attached image (Image3.png)
> >
> > 2.Drag the PutSQL in Root canvas.It couldn't show already created
> service.
> > (Image4.png)
> >
> > Note: I have attached all images in zip file.
> >
> > i don't know exactly is this correct behaviour for NiFi-1.1.1?
> >
> > How can i access the created controller service in Root into Root
> > Processors?
> >
> > can anyone explain concepts behind controller service creation in
> Nifi-1.1.1
> > root canvas?
>


Re: How to specify priority attributes for seperate flowfiles?

2017-03-06 Thread prabhu Mahendran
Bryan & Andre,

The PriorityAttributePrioritizer and FIFO strategies work when the flow has no
loop processing.

But I have configured loop processing in my workflow based on this reference:

https://gist.github.com/ijokarumawak/01c4fd2d9291d3e74ec424a581659ca8#file-loop_sample-xml

For example, if I pick up two files with the GetFile processor, the contents of
those flowfiles get shuffled after the files go around the loop.

That's why I can't prioritize my flowfiles.

Now I need to combine flowfiles that have the same file name.

*Use case:* consider two files named file1 and file2. I pick up both files,
make some changes to their contents, and then run them through a loop to apply
the same modifications to every file. Afterwards the queue holds flowfiles in a
shuffled order.

I want to combine flowfiles based on the name of the file: check the filename
and merge all flowfiles that share the same name into one flowfile (see the
MergeContent sketch below).

Can you suggest a way to achieve this?
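My current idea is a MergeContent processor configured roughly as below; the
property names are from the standard processor and the values are only a first
guess:

    Merge Strategy             : Bin-Packing Algorithm
    Correlation Attribute Name : filename
    Minimum Number of Entries  : the number of splits produced per file, if known
    Max Bin Age                : e.g. 30 sec, so a bin is eventually released

Is that the right direction?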

Thanks in Advance,

On Fri, Mar 3, 2017 at 8:08 PM, Bryan Bende <bbe...@gmail.com> wrote:

> What Andre described is what I had in mind as well...
>
> One thing to keep in mind is that I think you can only guarantee the
> ordering if all the files you want to process are picked up in one
> execution of GetFile.
>
> For example, imagine there are 100 files in the directory, and
> GetFile's Batch Size is set to 10 (the default). The first time
> GetFile executes it is going to get 10 out of the 100 flow files, and
> then using Andre's example with the epoch as the priority, you can get
> those 10 flow files processed in order.
>
> If you were trying to get total order across all 100 files, you would
> either need the batch size to be greater than the total number of
> files, or you would need some kind of custom processor that waited for
> N flow files, and then if the queue before that processor used the
> PriorityAttributePrioritizer, then you would be waiting until all 100
> flow files were in the queue in priority order before letting any of
> them process.
>
>
>
>
> On Fri, Mar 3, 2017 at 2:59 AM, Andre <andre-li...@fucs.org> wrote:
> > Prabhu,
> >
> > I suspect you need to rethink your use of concurrency on your workflow. I
> > give you an example:
> >
> > You spoke about 10 concurrent GetFile threads reading a repository and
> their
> > consequent ordering:
> >
> > Suppose you have 2 threads consuming:
> >
> > file1 - 10 MB
> > file2 - 20 MB
> > file3 - 50 MB
> > file4 - 10 MB
> > file5 - 10 MB
> > file6 - 10 MB
> >
> > All things equal, consider each of the 2 threads consume and dispatch the
> > files at the same speed. How can you guarantee that thread 1 will consume
> > file5 (i.e. as in t1-f1, t2-f2, t1-f3, t2-f4, t1-f5, t2-f6)?
> >
> > Or as Brandon DeVries clearly put a lojng while ago[1]:
> >
> > "Just because a FlowFile begins processing first doesn't mean it will
> > complete first (assuming the processor has multiple concurrent tasks)"
> >
> > Brandon goes further and provides some suggestions that may help you
> binning
> > your flowfiles and records together, but in any case...
> >
> >
> > Assuming the filename is named based on a date (e.g.
> > file_2017-03-03T010101.csv), have you considered using UpdateAttributes
> to
> > parse the filename into a date, that date into Epoch (which happens to
> be an
> > increasing number) as a first level index / prioritizer?
> >
> > This way you could have:
> >
> > GetFile (single thread) -- Connector with FIFO --> UpdateAttribute
> (adding
> > Epoch from filename date) -- Connector with PriorityAttributePrioritizer
> -->
> > rest of your flow
> >
> >
> > Once again, assuming the file name is file_2017-03-03T010101.csv, the
> > expression language would be something like:
> >
> > ${filename:toDate("'file_'yyyy-MM-dd'T'HHmmss'.csv'", "UTC"):toNumber()}
> >
> >
> > Would that help?
> >
> >
> > [1]
> > https://lists.apache.org/thread.html/203ddc0423ac7f877817ad5e2b389f
> 079c2a27d8d4b4ef998ad91a32@1449844053@%3Cdev.nifi.apache.org%3E
> >
> >
> > On 3 Mar 2017 5:27 PM, "prabhu Mahendran" <prabhuu161...@gmail.com>
> wrote:
> >>
> >> This task(NIFI-470) suits to some of the workflow. If I set concurrent
> >> task to 10, records runs in parallel so that each file gets shuffled as
> I
> >> can see in the List Queue.
> >>
> >>
> >>
> >> If we get order of files from the Getfile, How I can ensure the data
> from
> >> each file is properly moved to destination(con

Re: How to specify priority attributes for seperate flowfiles?

2017-03-02 Thread prabhu Mahendran
This task (NIFI-470) suits part of the workflow. If I set the concurrent tasks
to 10, records run in parallel, so the files get shuffled, as I can see in the
List Queue view.

Even if GetFile emits the files in order, how can I ensure the data from each
file is moved to the destination (say, SQL) in that same order when concurrent
tasks are in play?

I need a flow like this: file1 has 10 records and they should be prioritized
with the values 1 to 10; the records of the next file, file2, should then start
with priority value 11, and so on. The filenames can be ordered by the date
coming from the GetFile processor. That way I can ensure the files are moved
into SQL in the ordered sequence.

Will the ticket cover this, or is there another suggestion for it?

On Fri, Mar 3, 2017 at 11:37 AM, Andre <andre-li...@fucs.org> wrote:

> Hi,
>
> There's an existing JIRA ticket(NIFI-470) requesting a way to allow a DFM
> to fine tune how GetFile build it's queues and control how to prioritise
> the consumption of files.
>
> Would that be what you are looking after?
>
> Cheers
>
>
> On 3 Mar 2017 15:55, "prabhu Mahendran" <prabhuu161...@gmail.com> wrote:
>
> Yes, exactly you got my point.
>
>
>
> Consider the filename contains date, how to prioritze the files from the
> directory to come first based on the date(oldest date comes first to the
> latest date comes last)?
>
>
>
> Issue faced here: Consider I have 2 files in the directory, after the
> GetFile->SplitText->ExtractText, I used priority attribute in
> UpdateAttribute. Now each file is initalized with priority 1...10. For
> file1, each records has 1 to 10 priority value, similarly for file2, each
> records has 1 to 10 priority value. Actually I want input files to be
> prioritized based on date in the filename?  So that finally, oldest date
> records will be processed first and then the latest date records.
>
>
>
> On Thu, Mar 2, 2017 at 6:39 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> So in your example you are saying that 10 files get placed in a
>> directory, and inside each of those 10 files the data is already
>> ordered the way you want, but you want to ensure the 10 files get
>> processed in a specific order?
>>
>> If that is true, what determines the order of the 10 files? is it
>> based on the order they were written to the directory? or is there
>> something in the filename that indicates which file comes first? In
>> order for NiFi to prioritize these files, there has to be something
>> that tells NiFi what the priority is.
>>
>> On Wed, Mar 1, 2017 at 11:56 PM, prabhu Mahendran
>> <prabhuu161...@gmail.com> wrote:
>> > As you suggested, setting 3 UpdateAttribute may be tedious. Suppose I
>> have
>> > more than 10 flowfiles setting 10 updateattribute processor is lengthy
>> one.
>> > This case also not possible for dynamically generating flowfiles.
>> >
>> >
>> >
>> > How to set priority attribute for the flowfiles from Getfile? Suppose I
>> get
>> > 10 files in the Getfile processor, based on my priority I have ordered
>> the
>> > flowfile each line in the files till PutSQL. Here without considering
>> the
>> > order, based on the filecreation time, data is moved without my ordered
>> > records. For this case only I decided with the
>> PriorityAttributePrioritizer
>> > and used UpdateAttribute processor.
>> >
>> >
>> >
>> > I can able to set the priority attribute for each line in the file, but
>> not
>> > each files from GetFile. Can you suggest any solution?
>> >
>> >
>> >
>> >
>> > On Wed, Mar 1, 2017 at 7:18 PM, Bryan Bende <bbe...@gmail.com> wrote:
>> >>
>> >> I just responded to this question on stackoverflow:
>> >>
>> >>
>> >> https://stackoverflow.com/questions/42528993/how-to-specify-
>> priority-attributes-for-seperate-flowfiles
>> >>
>> >> Thanks,
>> >>
>> >> Bryan
>> >>
>> >> On Wed, Mar 1, 2017 at 5:19 AM, prabhu Mahendran
>> >> <prabhuu161...@gmail.com> wrote:
>> >> > I need to use PrioritizeAttributePrioritizer in NiFi.
>> >> >
>> >> > i have observed that prioritizers in below reference.
>> >> > https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings
>> >> >
>> >> > if i receive 10 flowfiles then i need to set the priority value for
>> >> > every
>> >> > flow file to be unique.
>> >> >
>> >> > After that specify queue configuration must be
>> >> > PrioritizeAttributePrioritizer.
>> >> >
>> >> > Then processing flowfiles based on priority value.
>> >> >
>> >> > How can i set priority value for seperate flow files or which
>> >> > prioritizer in
>> >> > Nifi to be work for my case?
>> >
>> >
>>
>
>
>


Re: How to specify priority attributes for seperate flowfiles?

2017-03-02 Thread prabhu Mahendran
Yes, exactly, you got my point.

Given that the filename contains a date, how do I prioritize the files from the
directory so that the oldest date comes first and the latest date comes last?

Issue faced here: consider I have 2 files in the directory. After
GetFile->SplitText->ExtractText, I set a priority attribute in UpdateAttribute.
Now each file is initialized with priorities 1...10: for file1, each record gets
a priority value of 1 to 10, and likewise for file2. What I actually want is for
the input files themselves to be prioritized by the date in the filename, so
that in the end the records with the oldest date are processed first and the
records with the latest date last.



On Thu, Mar 2, 2017 at 6:39 PM, Bryan Bende <bbe...@gmail.com> wrote:

> So in your example you are saying that 10 files get placed in a
> directory, and inside each of those 10 files the data is already
> ordered the way you want, but you want to ensure the 10 files get
> processed in a specific order?
>
> If that is true, what determines the order of the 10 files? is it
> based on the order they were written to the directory? or is there
> something in the filename that indicates which file comes first? In
> order for NiFi to prioritize these files, there has to be something
> that tells NiFi what the priority is.
>
> On Wed, Mar 1, 2017 at 11:56 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > As you suggested, setting 3 UpdateAttribute may be tedious. Suppose I
> have
> > more than 10 flowfiles setting 10 updateattribute processor is lengthy
> one.
> > This case also not possible for dynamically generating flowfiles.
> >
> >
> >
> > How to set priority attribute for the flowfiles from Getfile? Suppose I
> get
> > 10 files in the Getfile processor, based on my priority I have ordered
> the
> > flowfile each line in the files till PutSQL. Here without considering the
> > order, based on the filecreation time, data is moved without my ordered
> > records. For this case only I decided with the
> PriorityAttributePrioritizer
> > and used UpdateAttribute processor.
> >
> >
> >
> > I can able to set the priority attribute for each line in the file, but
> not
> > each files from GetFile. Can you suggest any solution?
> >
> >
> >
> >
> > On Wed, Mar 1, 2017 at 7:18 PM, Bryan Bende <bbe...@gmail.com> wrote:
> >>
> >> I just responded to this question on stackoverflow:
> >>
> >>
> >> https://stackoverflow.com/questions/42528993/how-to-
> specify-priority-attributes-for-seperate-flowfiles
> >>
> >> Thanks,
> >>
> >> Bryan
> >>
> >> On Wed, Mar 1, 2017 at 5:19 AM, prabhu Mahendran
> >> <prabhuu161...@gmail.com> wrote:
> >> > I need to use PrioritizeAttributePrioritizer in NiFi.
> >> >
> >> > i have observed that prioritizers in below reference.
> >> > https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings
> >> >
> >> > if i receive 10 flowfiles then i need to set the priority value for
> >> > every
> >> > flow file to be unique.
> >> >
> >> > After that specify queue configuration must be
> >> > PrioritizeAttributePrioritizer.
> >> >
> >> > Then processing flowfiles based on priority value.
> >> >
> >> > How can i set priority value for seperate flow files or which
> >> > prioritizer in
> >> > Nifi to be work for my case?
> >
> >
>


How to specify priority attributes for seperate flowfiles?

2017-03-01 Thread prabhu Mahendran
I need to use the PriorityAttributePrioritizer in NiFi.

I have read about the available prioritizers in this reference:
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings

If I receive 10 flowfiles, I need to set a unique priority value on every
flowfile.

After that, the queue configuration must be set to use the
PriorityAttributePrioritizer.

Then the flowfiles are processed based on their priority value.

How can I set the priority value for separate flowfiles, and which prioritizer
in NiFi will work for my case?


How to find average of two lines in NiFi?

2017-02-21 Thread prabhu Mahendran
I need to find the average of two values that appear on separate lines.

My CSV file looks like this:

Name,ID,Marks
Mahi,1,90
Mahi,1,90


Andy,2,100
Andy,2,100
Now I need to store the average of the 2 marks in the database: the "Average"
column should add the two marks, divide by 2, and store the result via a SQL
insert.

Table:

Name,ID,Average
Mahi,2,90
Andy,2,100
Is it possible to find the average of two values that sit in separate rows using NiFi?
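If there is no ready-made processor for this, would an ExecuteStreamCommand or
ExecuteScript step be acceptable? A minimal standalone Python sketch of the
aggregation I have in mind (the table name 'data' and the insert format are just
placeholders):

import csv
import sys
from collections import OrderedDict

# Read the CSV from stdin, group Marks by (Name, ID), and print one
# INSERT statement per group with the averaged value.
totals = OrderedDict()
for row in csv.DictReader(sys.stdin):
    if not row.get("Marks"):
        continue  # skip blank or incomplete lines
    key = (row["Name"], row["ID"])
    total, count = totals.get(key, (0.0, 0))
    totals[key] = (total + float(row["Marks"]), count + 1)

for (name, id_), (total, count) in totals.items():
    print("insert into data (Name, ID, Average) values ('%s', %s, %g);"
          % (name, id_, total / count))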


Re: Split csv file into multipart.

2017-02-21 Thread prabhu Mahendran
Andy,

Thank you so much for your answer.

It worked.

Thanks

On Mon, Feb 20, 2017 at 9:31 PM, Andy LoPresto <alopre...@apache.org> wrote:

> Prabhu,
>
> I answered this question and provided a template on StackOverflow [1].
>
> [1] http://stackoverflow.com/a/42353526/70465
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com <alopresto.apa...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Feb 20, 2017, at 6:51 AM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> I have CSV File which having below contents,
>
> *Input.csv*
>
>  Sample NiFi Data demonstration for below
> Due dates 20-02-2017,23-03-2017
>
> My Input No1 inside csv,,
> Animals,Today-20.02.2017,Yesterday-19-02.2017
> Fox,21,32
> Lion,20,12
>
> My Input No2 inside csv
> Name,ID,City
> Mahi,12,UK
> And,21,US
>
> Prabh,32,LI
>
> I need to split above whole csv(Input.csv) into two parts like *InputNo1.csv* 
> and *InputNo2.csv*.
>
> For InputNo1.csv should have below contents only.,
>
> Animals,Today-20.02.2017,Yesterday-19-02.2017
> Fox,21,32
> Lion,20,12
>
> For InputNo2.csv should have below contents.,
>
> Name,ID,City
> Mahi,12,UK
> And,21,US
>
> Prabh,32,LI
>
> Is this possible to convert csv into Multiple parts in NiFi possible with
> existing processors?
>
>
>


Re: How to avoid this splitting of single line as multi lines in SplitText?

2017-02-15 Thread prabhu Mahendran
Andy,

I have used the following properties in the ReplaceText processor.

Search Value:"(.*?)(\n)(.*?)"

Replacement Value:"$1\\n$3"

Character Set:UTF-8

Maximum Buffer Size:1MB

Replacement Strategy:Regex Replace

Evaluation Mode:Entire Text


The output of this processor is the same as the input; it didn't make any change.
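Following Lee's earlier suggestion, I may instead pre-process the file before
SplitText, e.g. via ExecuteStreamCommand. A minimal Python sketch of that idea
(it only toggles on double quotes and does not handle escaped "" quotes inside a
field):

import sys

# Replace newlines that occur inside double-quoted CSV fields with a literal
# "\n" so that SplitText later sees exactly one record per physical line.
out = []
in_quotes = False
for ch in sys.stdin.read():
    if ch == '"':
        in_quotes = not in_quotes
    if ch == '\n' and in_quotes:
        out.append('\\n')
    else:
        out.append(ch)
sys.stdout.write(''.join(out))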

Thanks,
prabhu

On Wed, Feb 15, 2017 at 12:35 PM, Andy LoPresto <alopre...@apache.org>
wrote:

> Prabhu,
>
> I answered this on Stack Overflow [1] but I think you could do it with
> ReplaceText before the SplitText using a regex like
>
> "(.*?)(\n)(.*?)" replaced with "$1\\n$3"
>
> [1] http://stackoverflow.com/a/42242665/70465
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com <alopresto.apa...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Feb 14, 2017, at 10:52 PM, Lee Laim <lee.l...@gmail.com> wrote:
>
> Prabhu,
>
> You need to remove the new lines from within the last field.  I'd
> recommend using awk in an execute stream command processor first, then
> splitting the text.  Alternatively, you could write a custom processor to
> specifically handle the incoming data.
>
> Lee
>
> On Feb 14, 2017, at 11:01 PM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> I have CSV file which contains following line.
>
> No,NAme,ID,Description
> 1,Stack,232,"ABCDEFGHIJKLMNO
>  -- Jiuaslkm asdasdasd"
>
> used below processor structure GetFile-->SplitText
>
> In SplitText i have given header and line split count as 1.
>
> So i think it could be split row as below..,
>
>  No,NAme,ID,Description
> 1,Stack,232,"ABCDEFGHIJKLMNO
>  -- Jiuaslkm asdasdasd:"
>
> But it actually split the csv as "2" splits like below.,
>
> *First SPlit:*
>
> No,NAme,ID,Description
> 1,Stack,232,"ABCDEFGHIJKLMNO
>
> *Second Split:*
>
> No,NAme,ID,Description
> -- Jiuaslkm asdasdasd"
>
> So i have faced data handling missed something.
>
> *GOal:Now i need to handle those data lines as single line.*
>
> Any one help me to resolve this?
>
>
>


Is this possible to increase speed of processors in NiFi?

2017-02-10 Thread prabhu Mahendran
Each processor supports concurrent tasks, and if I set concurrent tasks on my
processors it boosts their processing speed. But it also hurts system
performance: 100% disk usage, 100% memory usage, and so on.

Is there any other way to speed up processors without using concurrent tasks?


Flowfile handling in C# is possible or not?

2017-02-07 Thread prabhu Mahendran
Processor code is written in Java, which handles flowfile reads and writes and
transfers to relationships.

Is it possible to handle flowfiles from a .NET application?

Many thanks,
prabhu


Re: How to send the success status to GetFile processor?

2017-02-02 Thread prabhu Mahendran
Oleg,

Thanks for your response.

Is it possible to use just a directory with FetchFile or any other processor? I
don't know the names of the files stored inside the directory.

*${absolute.path}\${filename}*

*Note:* GetFile doesn't accept upstream connections, but I need a processor that
fetches the files inside a directory given only the directory name, without
supplying a filename.
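One workaround I am considering, using only standard processors; does this make
sense? The completion signal from ProcessGroupA would trigger:

    ExecuteStreamCommand  (e.g. 'cmd /c dir /b C:\output\dir' or 'ls -1 /output/dir'; the directory is just an example)
      -> SplitText        (Line Split Count = 1, so each split holds one file name)
      -> ExtractText      (put the line content into a 'filename' attribute)
      -> UpdateAttribute  (set 'absolute.path' to the directory)
      -> FetchFile        (File to Fetch = ${absolute.path}/${filename})

That way the trigger still comes from ProcessGroupA instead of a source
processor.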

Many thanks

On Thu, Feb 2, 2017 at 7:37 PM, Oleg Zhurakousky <
ozhurakou...@hortonworks.com> wrote:

> Prabhu
>
> Not sure I fully understand.
> While indeed GetFile does not allow for an incoming connection, it does
> allow for your use case to happen indirectly by monitoring a predefined
> directory. So, one PG finishes and produces a file that is being put into a
> directory monitored by another PG’s GetFile.
>
> Am I missing something?
>
> Cheers
> Oleg
>
> On Feb 2, 2017, at 5:48 AM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Consider the below scenario:
>
> ProcessGroupA->ProcessGroupB
>
>
> Since my ProcessgroupA ends with ExecuteProcess processor that runs
> console application and save result into a directory. In ProcessGroupB, I
> will process each file in the saved directory using GetFile processor.
>
>
> Once, ProcessGroupA is completed I want to run the ProcessgroupB which
> starts with GetFile processor. Since GetFile processor doesnt't have
> upstream connection, I couldn't run the flow here. How to send the success
> status to GetFile processor?
>
>
> Note: Since I dont know the filename, FetchFile processor is not suitable
> for my case.
>
>
>


How to send the success status to GetFile processor?

2017-02-02 Thread prabhu Mahendran
Consider the below scenario:

ProcessGroupA->ProcessGroupB



My ProcessGroupA ends with an ExecuteProcess processor that runs a console
application and saves its results into a directory. In ProcessGroupB, I process
each file in that directory using a GetFile processor.

Once ProcessGroupA has completed, I want to run ProcessGroupB, which starts with
the GetFile processor. Since GetFile does not accept an upstream connection, I
can't connect the flow here. How can I send the success status to the GetFile
processor?

Note: since I don't know the filenames, the FetchFile processor is not suitable
for my case.


Re: Convert xls to csv in Nifi.

2017-01-31 Thread prabhu Mahendran
Hi jeremy,

Thanks for your information.

Many thanks,
prabhu

On Sun, Jan 29, 2017 at 3:10 AM, Jeremy Dyer <jdy...@gmail.com> wrote:

> Prabhu I do plan to add HSSF once I get some spare cycles. It might end up
> being a separate PR but I do plan to implement it.
>
> Thanks,
> Jeremy Dyer
>
> On Fri, Jan 27, 2017 at 5:50 AM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> Can I expect the HSSF implementation to the existing PR?
>>
>> On Fri, Jan 27, 2017 at 10:06 AM, prabhu Mahendran <
>> prabhuu161...@gmail.com> wrote:
>>
>>> Jeremy,Thanks for your information.
>>>
>>> i think you will make effort to HSSF implementation in PR for convert
>>> (XLS into CSV). Is it right?
>>>
>>> Many thanks,
>>> prabhu
>>>
>>> On Wed, Jan 25, 2017 at 7:01 PM, Jeremy Dyer <jdy...@gmail.com> wrote:
>>>
>>>> Prabhu - NIFI-2613 is currently only able to convert .xlxs documents to
>>>> csv. The processor uses Apache POI XSSF implementation which only supports
>>>> xlxs while HSSF would be pre 2007 excel files. I think to your point I
>>>> should probably make an effort to add the HSSF implementation to the
>>>> existing PR.
>>>>
>>>> - Jeremy
>>>>
>>>> On Wed, Jan 25, 2017 at 1:31 AM, prabhu Mahendran <
>>>> prabhuu161...@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> i have look into below JIRA  for conversion of my excel documents into
>>>>> csv file.
>>>>>
>>>>> https://issues.apache.org/jira/browse/NIFI-2613
>>>>>
>>>>> i have apply patches in GitHub Pull request.And then i can able to
>>>>> convert my .xlxs documents into csv file.
>>>>>
>>>>> But while give .xls documents it can't converted to csv file.
>>>>>
>>>>> Is patch applicable in jira  only for .xlxs files or it's too for all
>>>>> excel formats?
>>>>>
>>>>> Many thanks,
>>>>> prabhu
>>>>>
>>>>
>>>>
>>>
>>
>


Re: Data extraction for 100 columns is possible in NiFi?

2017-01-31 Thread prabhu Mahendran
@Mark,@Matt,@Nick

Thanks for your valuable information.

It really helpful for me.

Many thanks,
prabhu

On Mon, Jan 30, 2017 at 11:38 PM, Nick Carenza <
nick.care...@thecontrolgroup.com> wrote:

> Hey Prabhu, I just finished up a csv processing flow myself and it looks
> like this:
>
> CSV Flowfile -> InferAvroSchema -> ConvertCSVToAvro -> ConvertAvroToJson
>
> You can then use the ConvertJSONToSQL processor to finish things up.
> I would have liked to be able to go directly from CSV to JSON but I don't
> think there is a built-in processor that does that.
>
> - Nick
>
> On Mon, Jan 30, 2017 at 5:59 AM, Matt Burgess <mattyb...@apache.org>
> wrote:
>
>> Prabhu,
>>
>> I agree with Mark; if you want to use ExecuteScript for this, I have
>> an example of splitting fields (using a bar | delimiter, but you can
>> change to comma) [1].  If you have quoted values that can contain
>> commas, then like Mark said you may want to look at writing a custom
>> processor, or using a third-party library such as OpenCSV [2], I have
>> an example on including such things in an ExecuteScript configuration
>> [3].
>>
>> Regards,
>> Matt
>>
>> [1] http://funnifi.blogspot.com/2016/02/executescript-explained-
>> split-fields.html
>> [2] http://opencsv.sourceforge.net/
>> [3] http://funnifi.blogspot.com/2016/02/executescript-using-modules.html
>>
>> On Mon, Jan 30, 2017 at 8:43 AM, Mark Payne <marka...@hotmail.com> wrote:
>> > Prabhu,
>> >
>> > My guess is that you probably could find some way to do this with the
>> > standard out-of-the-box processors
>> > that come with NiFi. Perhaps by using Extract Text to extract the header
>> > columns, and then using ReplaceText
>> > and perhaps a few other processors. Going down this route though is
>> likely
>> > to be incredibly inefficient, though,
>> > and hard to understand/maintain.
>> >
>> > I think this is a great use case for either a custom processor or a
>> simple
>> > Groovy/Python script using the ExecuteScript
>> > Processor.
>> >
>> > Thanks
>> > -Mark
>> >
>> > On Jan 30, 2017, at 1:12 AM, prabhu Mahendran <prabhuu161...@gmail.com>
>> > wrote:
>> >
>> > I have a CSV data with 100 columns like below..,
>> >
>> > ,A,B,C,D,E,F,..[upto 100 Header columns]
>> > Date,A1,B1,C1,D1,E1,F1..[upto 100 Header columns]
>> > 30/01/2017 23:23:22,Majestic,32,2100.12[upto 100 data columns]
>> >
>> > In data having first 2 header lines and 3 rd line is data in which i
>> > inserted with header lines in below format.
>> >
>> > In Database insert those data with following format.
>> >
>> > insert into data values('30/01/2017 23:23:22','A','A1','Majestic');
>> > insert into data values('30/01/2017 23:23:22','B','B1,'32');
>> >
>> > Please stop me if anything i'm doing wrong.
>> >
>> > Is this possible in apache nifi?
>> >
>> > Many thanks
>> >
>> >
>>
>
>


Data extraction for 100 columns is possible in NiFi?

2017-01-29 Thread prabhu Mahendran
I have CSV data with 100 columns, like below:

,A,B,C,D,E,F,..[upto 100 Header columns]
Date,A1,B1,C1,D1,E1,F1..[upto 100 Header columns]
30/01/2017 23:23:22,Majestic,32,2100.12[upto 100 data columns]

The data has 2 header lines, and the 3rd line onwards is the data, which I want
to insert together with the header values in the format below.

In the database, the rows should be inserted like this:

*insert into data values('30/01/2017 23:23:22','A','A1','Majestic');*


*insert into data values('30/01/2017 23:23:22','B','B1','32');*

Please stop me if I'm doing anything wrong.

Is this possible in Apache NiFi?
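If there is no built-in processor for this, I am thinking of an
ExecuteStreamCommand or ExecuteScript step. A minimal standalone Python sketch
of the unpivot I have in mind (it assumes no embedded commas in the values;
'data' is just the table name used above):

import sys

# Unpivot the wide CSV described above: line 1 holds the first header row
# (after a leading empty field), line 2 the second header row, and every later
# line starts with a timestamp followed by up to 100 data values.
lines = [line.rstrip("\n") for line in sys.stdin if line.strip()]
header1 = lines[0].split(",")[1:]   # A, B, C, ...
header2 = lines[1].split(",")[1:]   # A1, B1, C1, ...

for data_line in lines[2:]:
    fields = data_line.split(",")
    timestamp, values = fields[0], fields[1:]
    for h1, h2, value in zip(header1, header2, values):
        print("insert into data values('%s','%s','%s','%s');"
              % (timestamp, h1, h2, value))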

Many thanks


Re: How can i compare hours in which data having with current datetime hours?

2017-01-27 Thread prabhu Mahendran
Mark,

Thanks for your information.

It works for one of my inputs.

But I have another input whose first column looks like *'07/01/2017 0-1'*, i.e.
a date followed by an hour range.


07/01/2017 0-1,Nick,22

.

.

07/01/2012 23-24,Sandy,35



Given input like the above, how can I match the records from the last 48 hours,
counted back from the current hour, in NiFi? My attempt is below.
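My current attempt, adapting your RouteText expression (it assumes the date part
is dd/MM/yyyy and takes the starting hour of the range, so '07/01/2017 0-1' is
parsed as hour 0; 172800000 ms is 48 hours):

${line:getDelimitedField(1):substringBefore('-'):toDate('dd/MM/yyyy H'):toNumber():gt( ${now():toNumber():minus(172800000)} )}

Does that look right?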

On Mon, Jan 23, 2017 at 8:23 PM, Mark Payne <marka...@hotmail.com> wrote:

> Prabhu,
>
> I think the RouteText processor will give you what you need. If you set
> the "Matching Strategy" property to "Satisfies Expression,"
> then it will allow you to use the Expression Language to evaluate each
> line of text in the file. Each line of text is available using
> the "line" variable. So, for example, you could create a Property named
> "older.than.72.hours" with a value of something like:
> ${line:getDelineatedValues(2):toDate('MM.dd.yyyy
> HH:mm:ss'):toNumber():lt( ${now():minus(259200000)} )}
>
> The now():minus(259200000) means subtract 259200000 milliseconds (72
> hours) from the current date/time.
>
> So this will look at each line in the text file, get the second delineated
> value (by default it uses a comma as the delimiter), convert
> it to a date/time using the format of <month>.<day>.<4 digit
> year> <24-hour hour>:<minute>:<second>, and then convert
> that to a number (in milliseconds since midnight Jan. 1, 1970). It then
> compares this value to see if the value is less than the current
> date/time minus 72 hours.
>
> Each line of text that matches will be routed to 'older.than.72.hours'.
> Note, the outgoing FlowFiles may contain several lines of text
> each, though. It does not split the output into individual lines. You can
> do so if necessary via SplitText.
>
> Thanks!
> -Mark
>
>
> On Jan 23, 2017, at 5:23 AM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> My input contains dateTime with hours like below.,
>
> *No,DateTime,Name,ID*
> *1,23.01.2016 09:02:21,Mega,201*
> *2,03.01.2016 10:02:23,Hema,202*
>
> Now i need to get 2nd Column ["02.01.2016 10:02:23"] and then compare it
> with current dateTime with milliseconds.
>
> After that if 2nd Column is in between past 72 hours and then insert into
> sql server.
>
> *For example.,*
>
>  My Current DateTime is "*04.01.2016 10:23:21"*
>
> For 1st row.,
>
>  1,*23.01.2016 09:02:21*,Mega,201
>
> =>Compare *23.01.2016 09:02:21* with 
> *04.01.2016
> 10:23:21 [Current Time].It couldn't insert into SQL Server due to *
> *DateTime*
> * is not past 72 hrs when it
>compared  with current dateTime.   *
> For 2 nd row,
>
>
>
>
>
> *2,03.01.2016 10:02:23,Hema,202=>This
> Should be insert into SQL Server.Because it's date is "03.01.2016" and it
> is past 72 hour data by compare it with current datetime. *
>
> *How can i achieve my use case in Nifi Processors?*
>
> *Please stop me if anything am doing wrong,*
>
> *Thanks,*
> *prabhu*
>
>
>


Re: Convert xls to csv in Nifi.

2017-01-27 Thread prabhu Mahendran
Can I expect the HSSF implementation to be added to the existing PR?

On Fri, Jan 27, 2017 at 10:06 AM, prabhu Mahendran <prabhuu161...@gmail.com>
wrote:

> Jeremy,Thanks for your information.
>
> i think you will make effort to HSSF implementation in PR for convert (XLS
> into CSV). Is it right?
>
> Many thanks,
> prabhu
>
> On Wed, Jan 25, 2017 at 7:01 PM, Jeremy Dyer <jdy...@gmail.com> wrote:
>
>> Prabhu - NIFI-2613 is currently only able to convert .xlxs documents to
>> csv. The processor uses Apache POI XSSF implementation which only supports
>> xlxs while HSSF would be pre 2007 excel files. I think to your point I
>> should probably make an effort to add the HSSF implementation to the
>> existing PR.
>>
>> - Jeremy
>>
>> On Wed, Jan 25, 2017 at 1:31 AM, prabhu Mahendran <
>> prabhuu161...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> i have look into below JIRA  for conversion of my excel documents into
>>> csv file.
>>>
>>> https://issues.apache.org/jira/browse/NIFI-2613
>>>
>>> i have apply patches in GitHub Pull request.And then i can able to
>>> convert my .xlxs documents into csv file.
>>>
>>> But while give .xls documents it can't converted to csv file.
>>>
>>> Is patch applicable in jira  only for .xlxs files or it's too for all
>>> excel formats?
>>>
>>> Many thanks,
>>> prabhu
>>>
>>
>>
>


Re: Past month data insertion is possible

2017-01-26 Thread prabhu Mahendran
Lee,

This approach may not work for my case.



Actually I want like this:



Input given  -  Output Expected

01-2017  -  12-2016

05-2017  -  04-2017



If my input is the current month with its year, the expected output is the
previous month, cross-checked against the year (the year has to be considered
too, because of the January case).



Your answer would satisfy a plain less-than condition for every previous month, but I only want exactly the previous one.
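Would a comparison on the 'yyyyMM' string work here? Something like the
following in RouteOnAttribute; it assumes the first column is already in an
attribute named csv.1 with the format dd.MM.yyyy HH:mm:ss, and it derives the
previous month by taking the first day of the current month and stepping back
one day (86400000 ms):

${csv.1:toDate('dd.MM.yyyy HH:mm:ss'):format('yyyyMM'):equals( ${now():format('yyyy-MM'):append('-01'):toDate('yyyy-MM-dd'):toNumber():minus(86400000):format('yyyyMM')} )}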

On Fri, Jan 27, 2017 at 11:43 AM, Lee Laim <lee.l...@gmail.com> wrote:

> Prabhu,
>
> Using epoch time might end up being a simpler comparison.   If the
> converted date is less than
> 1483254000 (midnight of first day of current month), it is the previous
> month (for my timezone).Thanks,
> Lee
>
>
> On Thu, Jan 26, 2017 at 10:42 PM, prabhu Mahendran <
> prabhuu161...@gmail.com> wrote:
>
>> Hi Andy,
>>
>> i have already tried with your alternative solution.
>> "UpdateAttribute to add a new attribute with the previous month value,
>> and then RouteOnAttribute to determine if the flowfile should be
>> inserted "
>>
>> i have used below expression language in RouteOnAttribute,
>>
>>
>> *${literal('Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec'):getDelimitedField(${csv.1:toDate('dd.MM.
>> hh:mm:ss'):format('MM')}):equals(${literal('Dec,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov'):getDelimitedField(${now():toDate('
>> Z MM dd HH:mm:ss.SSS '):format('MM'):toNumber()})})}*
>>
>>
>> it could be failed in below data.,
>>
>> *23.12.2015,Andy,21*
>> *23.12.2017,Present,32*
>>
>>
>> My data may contains some past years and future years
>>
>> It matches with my expression it also inserted.
>>
>> I need to check month with year in data.
>>
>> How can i check it?
>>
>>
>> On Fri, Jan 27, 2017 at 10:52 AM, Andy LoPresto <alopre...@apache.org>
>> wrote:
>>
>>> Prabhu,
>>>
>>> I answered this question with an ExecuteScript example which will do
>>> what you are looking for on Stack Overflow [1].
>>>
>>> [1] http://stackoverflow.com/a/41887397/70465
>>>
>>> Andy LoPresto
>>> alopre...@apache.org
>>> *alopresto.apa...@gmail.com <alopresto.apa...@gmail.com>*
>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>>
>>> On Jan 26, 2017, at 8:40 PM, prabhu Mahendran <prabhuu161...@gmail.com>
>>> wrote:
>>>
>>> Hi All,
>>>
>>> I have data in which i need to compare month of data if it is previous
>>> month then it should be insert otherwise not.
>>>
>>> *Example:*
>>>
>>> *23.12.2016 12:02:23,Koji,24*
>>> 22.01.2016 01:21:22,Mahi,24
>>>
>>> Now i need to get first column of data (23.12.2016 12:02:23) and then
>>> get month (12) on it.
>>>
>>> Compared that with before of current month like.,
>>>
>>>
>>> *If current month is 'JAN_2017',then get before of 'JAN_2017' it should
>>> be 'Dec_2016'*
>>> For First row,
>>>
>>> *compare that 'Dec_2016' with month of data 'Dec_2016' *[23.12.2016]*.*
>>>
>>> It matched then insert into database.
>>>
>>> if it not matched then ignore it.
>>>
>>> is it possible in nifi?
>>>
>>> Many thanks,
>>> prabhu
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Past month data insertion is possible

2017-01-26 Thread prabhu Mahendran
Hi Andy,

I have already tried your alternative solution:
"UpdateAttribute to add a new attribute with the previous month value, and
then RouteOnAttribute to determine if the flowfile should be inserted "

I have used the expression language below in RouteOnAttribute:


*${literal('Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec'):getDelimitedField(${csv.1:toDate('dd.MM.
hh:mm:ss'):format('MM')}):equals(${literal('Dec,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov'):getDelimitedField(${now():toDate('
Z MM dd HH:mm:ss.SSS '):format('MM'):toNumber()})})}*


It fails on the data below:

*23.12.2015,Andy,21*
*23.12.2017,Present,32*


My data may contain some past years and future years.

Those rows also match my expression, so they get inserted as well.

I need to check the month together with the year in the data.

How can I check it?
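A minimal ExecuteScript (Jython) sketch for this check, along the lines of the
Stack Overflow answer Andy links below; the attribute name 'csv.1' and the
date pattern are assumptions, and the script is untested:

    from java.text import SimpleDateFormat
    from java.util import Calendar

    flowFile = session.get()
    if flowFile is not None:
        # e.g. '23.12.2016 12:02:23'
        dateText = flowFile.getAttribute('csv.1')
        rowDate = SimpleDateFormat('dd.MM.yyyy HH:mm:ss').parse(dateText)

        rowCal = Calendar.getInstance()
        rowCal.setTime(rowDate)

        prevCal = Calendar.getInstance()   # current date
        prevCal.add(Calendar.MONTH, -1)    # previous month, rolls the year for January

        isPrevMonth = (rowCal.get(Calendar.MONTH) == prevCal.get(Calendar.MONTH)
                       and rowCal.get(Calendar.YEAR) == prevCal.get(Calendar.YEAR))

        flowFile = session.putAttribute(flowFile, 'previous.month',
                                        'true' if isPrevMonth else 'false')
        session.transfer(flowFile, REL_SUCCESS)

A RouteOnAttribute on ${previous.month:equals('true')} can then decide whether
the row is inserted.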


On Fri, Jan 27, 2017 at 10:52 AM, Andy LoPresto <alopre...@apache.org>
wrote:

> Prabhu,
>
> I answered this question with an ExecuteScript example which will do what
> you are looking for on Stack Overflow [1].
>
> [1] http://stackoverflow.com/a/41887397/70465
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com <alopresto.apa...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Jan 26, 2017, at 8:40 PM, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Hi All,
>
> I have data in which i need to compare month of data if it is previous
> month then it should be insert otherwise not.
>
> *Example:*
>
> *23.12.2016 12:02:23,Koji,24*
> 22.01.2016 01:21:22,Mahi,24
>
> Now i need to get first column of data (23.12.2016 12:02:23) and then get
> month (12) on it.
>
> Compared that with before of current month like.,
>
>
> *If current month is 'JAN_2017',then get before of 'JAN_2017' it should be
> 'Dec_2016'*
> For First row,
>
> *compare that 'Dec_2016' with month of data 'Dec_2016' *[23.12.2016]*.*
>
> It matched then insert into database.
>
> if it not matched then ignore it.
>
> is it possible in nifi?
>
> Many thanks,
> prabhu
>
>
>
>
>


Past month data insertion is possible

2017-01-26 Thread prabhu Mahendran
Hi All,

I have data in which I need to compare the month: if it is the previous month
then the row should be inserted, otherwise not.

*Example:*

*23.12.2016 12:02:23,Koji,24*
22.01.2016 01:21:22,Mahi,24

Now I need to take the first column of the data (23.12.2016 12:02:23) and
extract the month (12) from it.

Compare that with the month before the current month, like this:

*If the current month is 'JAN_2017', then the month before 'JAN_2017' is
'Dec_2016'.*

For the first row,

*compare that 'Dec_2016' with the month of the data, 'Dec_2016' *[23.12.2016]*.*

If it matches, then insert the row into the database.

If it does not match, then ignore it.

Is this possible in NiFi?

Many thanks,
prabhu


Re: Convert xls to csv in Nifi.

2017-01-26 Thread prabhu Mahendran
Jeremy, thanks for your information.

I think you will make an effort to add the HSSF implementation to the PR that
converts XLS into CSV. Is that right?

Many thanks,
prabhu

On Wed, Jan 25, 2017 at 7:01 PM, Jeremy Dyer <jdy...@gmail.com> wrote:

> Prabhu - NIFI-2613 is currently only able to convert .xlxs documents to
> csv. The processor uses Apache POI XSSF implementation which only supports
> xlxs while HSSF would be pre 2007 excel files. I think to your point I
> should probably make an effort to add the HSSF implementation to the
> existing PR.
>
> - Jeremy
>
> On Wed, Jan 25, 2017 at 1:31 AM, prabhu Mahendran <prabhuu161...@gmail.com
> > wrote:
>
>> Hi All,
>>
>> i have look into below JIRA  for conversion of my excel documents into
>> csv file.
>>
>> https://issues.apache.org/jira/browse/NIFI-2613
>>
>> i have apply patches in GitHub Pull request.And then i can able to
>> convert my .xlxs documents into csv file.
>>
>> But while give .xls documents it can't converted to csv file.
>>
>> Is patch applicable in jira  only for .xlxs files or it's too for all
>> excel formats?
>>
>> Many thanks,
>> prabhu
>>
>
>


Convert xls to csv in Nifi.

2017-01-24 Thread prabhu Mahendran
Hi All,

I have looked into the JIRA below for converting my Excel documents into CSV
files.

https://issues.apache.org/jira/browse/NIFI-2613

I have applied the patches from the GitHub pull request, and with them I am
able to convert my .xlsx documents into CSV files.

But when I give it .xls documents, they cannot be converted to CSV.

Is the patch in the JIRA applicable only to .xlsx files, or does it cover all
Excel formats?

Many thanks,
prabhu


How can i compare hours in which data having with current datetime hours?

2017-01-23 Thread prabhu Mahendran
My input contains a DateTime column with hours, like below:

*No,DateTime,Name,ID*
*1,23.01.2016 09:02:21,Mega,201*
*2,03.01.2016 10:02:23,Hema,202*

Now I need to take the 2nd column ["02.01.2016 10:02:23"] and compare it with
the current DateTime, down to milliseconds.

After that, if the 2nd column falls within the past 72 hours, insert the row
into SQL Server.

*For example:*

My current DateTime is "04.01.2016 10:23:21".

For the 1st row,

1,23.01.2016 09:02:21,Mega,201

=> Compare 23.01.2016 09:02:21 with 04.01.2016 10:23:21 [current time]. This
row could not be inserted into SQL Server, because its DateTime is not within
the past 72 hours when compared with the current DateTime.

For the 2nd row,

2,03.01.2016 10:02:23,Hema,202

=> This row should be inserted into SQL Server, because its date is
"03.01.2016", which is within the past 72 hours when compared with the
current DateTime.

*How can I achieve this use case with NiFi processors?*

*Please correct me if I am doing anything wrong.*

*Thanks,*
*prabhu*
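A possible RouteOnAttribute expression for this 72-hour check, assuming the
row has already been split and extracted so that the DateTime sits in an
attribute named csv.2 (the attribute name and the date pattern are
assumptions); 72 hours is 259200000 milliseconds:

    ${csv.2:toDate('dd.MM.yyyy HH:mm:ss'):toNumber():ge(${now():toNumber():minus(259200000)}):and(${csv.2:toDate('dd.MM.yyyy HH:mm:ss'):toNumber():le(${now():toNumber()})})}

Rows routed to that property's relationship would go on to the SQL insert;
everything else can be ignored.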


Re: How to download HTTP Content with multiple years in GetHTTP Processor or any processor?

2017-01-23 Thread prabhu Mahendran
Hi  Koji,

Thanks for your information.

Many thanks,
prabhu

On Mon, Jan 23, 2017 at 2:35 PM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> GetHTTP doesn't take input relationship, so I'd recommend to use
> InvokeHTTP instead.
>
> With UpdateAttribute and RouteOnAttribute, you can create a loop in
> NiFi flow. An example is available here:
> https://gist.github.com/ijokarumawak/01c4fd2d9291d3e74ec424a581659ca8
>
> The loop counter can be used as year parameter, with NiFi Attribute
> Expression Language in InvokeHTTP processor's Remote URL, e.g:
> http://example.com/?year=${name-of-loop-counter}.
>
> Thanks,
> Koji
>
> On Mon, Jan 23, 2017 at 3:28 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > I have HTTP_URL which having file in year basis(2000-2010).
> >
> > In GetHTTP Processor(NiFi-0.6.1), i have given below to url to download
> data
> > for particular year like below.,
> >
> > http:\\myurl.com\year=2000
> >
> > It can download the file which i need in particular year 2000.
> >
> > But now my case i need to download files in years from 2000 to 2010.
> >
> > So i need to download all data's in single GetHTTP Processor.If i use
> > multiple processors by give year with 2000 to 2010 then it works.
> >
> > I have to use single processor which updates year in url from 2000-2010
> to
> > download data's in year wise.
> >
> > Is this possible to download content across multiple years in single
> > processor?
> >
> > Please stop me if anything am doing wrong.
>


How to download HTTP Content with multiple years in GetHTTP Processor or any processor?

2017-01-22 Thread prabhu Mahendran
I have an HTTP URL which serves files on a per-year basis (2000-2010).

In the GetHTTP processor (NiFi 0.6.1), I have given the URL below to download
the data for a particular year:

*http:\\myurl.com\year=2000*

It downloads the file I need for the particular year 2000.

But now I need to download the files for the years from 2000 to 2010.

So I need to download all the data with a single GetHTTP processor. If I use
multiple processors, one per year from 2000 to 2010, then it works.

I want a single processor which updates the year in the URL from 2000 to 2010
and downloads the data year by year.

Is it possible to download content across multiple years with a single
processor?

Please correct me if I am doing anything wrong.
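A rough sketch of the loop Koji describes in the reply above, using InvokeHTTP
together with a counter attribute; the attribute name 'year' and the property
values are illustrative assumptions:

    UpdateAttribute (set once):  year = 2000
    InvokeHTTP:                  Remote URL = http://myurl.com/?year=${year}
    UpdateAttribute:             year = ${year:plus(1)}
    RouteOnAttribute:            continue = ${year:le(2010)}
                                 ('continue' loops back to InvokeHTTP, 'unmatched' ends the flow)

The gist Koji links shows the same looping pattern as a ready-made template.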


Re: How backup works when flow.xml size more than max storage?

2017-01-19 Thread prabhu Mahendran
Hi Koji,

Both simulations look perfect. This is exactly the behavior I expected and it
matches my requirement; it also sounds logical. Can I expect these changes in
the next NiFi release version?


Thank you so much for this tremendous support.


On Fri, Jan 20, 2017 at 6:14 AM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> In that case, yes, as your assumption, even the latest archive exceeds
> 500MB, the latest archive is saved, as long as it was written to disk
> successfully.
>
> After that, when user updates NiFi flow, before new archive is
> created, the previous one will be removed, because max.storage
> exceeds. Then the latest will be archived.
>
> Let's simulate the scenario with the to-be-updated logic by NIFI-3373,
> in which the size of flow.xml keeps increasing:
>
> # CASE-1
>
> archive.max.storage=10MB
> archive.max.count = 5
>
> Time | flow.xml | archives | archive total |
> t1 | f1 5MB  | f1 | 5MB
> t2 | f2 5MB  | f1, f2 | 10MB
> t3 | f3 5MB  | f1, f2, f3 | 15MB
> t4 | f4 10MB | f2, f3, f4 | 20MB
> t5 | f5 15MB | f4, f5 | 25MB
> t6 | f6 20MB | f6 | 20MB
> t7 | f7 25MB | t7 | 25MB
>
> * t3: f3 can is archived even total exceeds 10MB. Because f1 + f2 <=
> 10MB. WAR message starts to be logged from this point, because total
> archive size > 10MB.
> * t4: The oldest f1 is removed, because f1 + f2 + f3 > 10MB.
> * t5: Even if flow.xml size exceeds max.storage, the latest archive is
> created. f4 are kept because f4 <= 10MB.
> * t6: f4 and f5 are removed because f4 + f5 > 10MB, and also f5 > 10MB.
>
> In this case, NiFi will keep logging WAR (or should be ERR??) message
> indicating archive storage size is exceeding limit, from t3.
> After t6, even if archive.max.count = 5, NiFi will only keep the
> latest flow.xml.
>
> # CASE-2
>
> If you'd like to keep at least 5 archives no matter what, then set
> blank max.storage and max.time.
>
> archive.max.storage=
> archive.max.time=
> archive.max.count = 5 // Only limit archives by count
>
> Time | flow.xml | archives | archive total |
> t1 | f1 5MB  | f1 | 5MB
> t2 | f2 5MB  | f1, f2 | 10MB
> t3 | f3 5MB  | f1, f2, f3 | 15MB
> t4 | f4 10MB | f1, f2, f3, f4 | 25MB
> t5 | f5 15MB | f1, f2, f3, f4, f5 | 40MB
> t6 | f6 20MB | f2, f3, f4, f5, f6 | 55MB
> t7 | f7 25MB | f3, f4, f5, f6, (f7) | 50MB, (75MB)
> t8 | f8 30MB | f3, f4, f5, f6 | 50MB
>
> * From t6, oldest archive is removed to keep number of archives <= 5
> * At t7, if the disk has only 60MB space, f7 won't be archived. And
> after this point, archive mechanism stop working (Trying to create new
> archive, but keep getting exception: no space left on device).
>
> In either case above, once flow.xml has grown to that size, some human
> intervention would be needed.
> Do those simulation look reasonable?
>
> Thanks,
> Koji
>
> On Thu, Jan 19, 2017 at 5:48 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Hi Koji,
> >
> > Thanks for your information.
> >
> > Actually the task description looks fine. I have one question here,
> consider
> > the storage limit is 500MB, suppose my latest workflow exceeds this
> limit,
> > which behavior is performed with respect to the properties(max.count,
> > max.time and max.storage)?? In my assumption latest archive is saved
> even it
> > exceeds 500MB, so what happen from here? Either it will keep on save the
> > single latest archive with the large size or it will notify the user to
> > increase the size and preserves the latest file till we restart the
> flow??
> > If so what happens if the size is keep on increasing with respect to
> 500MB,
> > it will save archive based on count or only latest archive throughtout
> nifi
> > is in running status??
> >
> > Many thanks
> >
> > On Thu, Jan 19, 2017 at 12:47 PM, Koji Kawamura <ijokaruma...@gmail.com>
> > wrote:
> >>
> >> Hi Prabhu,
> >>
> >> Thank you for the suggestion.
> >>
> >> Keeping latest N archives is nice, it's simple :)
> >>
> >> The max.time and max.storage have other benefit and since already
> >> released, we should keep existing behavior with these settings, too.
> >> I've created a JIRA to add archive.max.count property.
> >> https://issues.apache.org/jira/browse/NIFI-3373
> >>
> >> Thanks,
> >> Koji
> >>
> >> On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
> >> <prabhuu161...@gmail.com> wrote:
> >> > Hi Koji,
> >> >
> >> >
> >> > Thanks for your reply,
> >> >
> >>
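For reference, the archive behavior discussed in this thread is driven by the
nifi.properties entries below; the values only mirror Koji's CASE-1
simulation, and the max.count property is the new one proposed in NIFI-3373,
so its exact name may differ until it is released:

    nifi.flow.configuration.archive.max.time=24 hours
    nifi.flow.configuration.archive.max.storage=10 MB
    nifi.flow.configuration.archive.max.count=5

Leaving max.time and max.storage blank and setting only max.count gives the
count-only behavior of Koji's CASE-2.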

Re: How backup works when flow.xml size more than max storage?

2017-01-19 Thread prabhu Mahendran
Hi Koji,

Thanks for your information.

Actually the task description looks fine. I have one question here: suppose
the storage limit is 500 MB and my latest workflow exceeds this limit; which
behavior applies with respect to the properties (max.count, max.time and
max.storage)? My assumption is that the latest archive is saved even if it
exceeds 500 MB, so what happens from there? Will it keep saving only the
single latest large archive, or will it notify the user to increase the size
and preserve the latest file until we restart the flow? And if the size keeps
increasing beyond 500 MB, will it save archives based on the count, or only
the latest archive for as long as NiFi is running?

Many thanks

On Thu, Jan 19, 2017 at 12:47 PM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> Thank you for the suggestion.
>
> Keeping latest N archives is nice, it's simple :)
>
> The max.time and max.storage have other benefit and since already
> released, we should keep existing behavior with these settings, too.
> I've created a JIRA to add archive.max.count property.
> https://issues.apache.org/jira/browse/NIFI-3373
>
> Thanks,
> Koji
>
> On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > Hi Koji,
> >
> >
> > Thanks for your reply,
> >
> > Yes. Solution B may meet as I required. Currently if the storage size
> meets,
> > complete folder is getting deleted and the new flow is not tracked in the
> > archive folder. This behavior is the drawback here. I need atleast last
> > workflow to be saved in the archive folder and notify the user to
> increase
> > the size. At the same time till nifi restarts, atleast last complete
> > workflow should be backed up.
> >
> >
> > My another suggestion is as follows:
> >
> >
> > Regardless of the max.time and max.storage property, Can we have only few
> > files in archive(consider only 10 files). Each action from the nifi
> canvas
> > should be tracked here, if the flow.xml.gz archive files count reaches it
> > should delete the old first file and save the latest file, so that the
> count
> > 10 is maintained. Here we can maintain the workflow properly and backup
> is
> > also achieved without confusing with max.time and max.storage. Only case
> is
> > that the disk size exceeds, we should notify user about this.
> >
> >
> > Many thanks.
> >
> >
> > On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <ijokaruma...@gmail.com>
> > wrote:
> >>
> >> Hi Prabhu,
> >>
> >> Thanks for sharing your experience with flow file archiving.
> >> The case that a single flow.xml.gz file size exceeds
> >> archive.max.storage was not considered well when I implemented
> >> NIFI-2145.
> >>
> >> By looking at the code, it currently works as follows:
> >> 1. The original conf/flow.xml.gz (> 1MB) is archived to conf/archive
> >> 2. NiFi checks if there's any expired archive files, and delete it if
> any
> >> 3. NiFi checks if the total size of all archived files, then delete
> >> the oldest archive. Keep doing this until the total size becomes less
> >> than or equal to the configured archive.max.storage.
> >>
> >> In your case, at step 3, the newly created archive is deleted, because
> >> its size was grater than archive.max.storage.
> >> In this case, NiFi only logs INFO level message, and it's hard to know
> >> what happened from user, as you reported.
> >>
> >> I'm going to create a JIRA for this, and fix current behavior by
> >> either one of following solutions:
> >>
> >> A. treat archive.max.storage as a HARD limit. If the original
> >> flow.xml.gz exceeds configured archive.max.storage in size, then throw
> >> an IOException, which results a WAR level log message "Unable to
> >> archive flow configuration as requested due to ...".
> >>
> >> B. treat archive.max.storage as a SOFT limit. By not including the
> >> newly created archive file at the step 2 and 3 above, so that it can
> >> stay there. Maybe a WAR level log message should be logged.
> >>
> >> For greater user experience, I'd prefer solution B, so that it can be
> >> archived even the flow.xml.gz exceeds archive storage size, since it
> >> was able to be written to disk, which means the physical disk had
> >> enough space.
> >>
> >> How do you think?
> >>
> >> Thanks!
> >> Koji
> >>
> >> On Wed, J

Re: How backup works when flow.xml size more than max storage?

2017-01-18 Thread prabhu Mahendran
Hi Koji,


Thanks for your reply,

Yes, solution B may meet what I need. Currently, once the storage limit is
reached, the whole folder gets deleted and the new flow is no longer tracked
in the archive folder. That behavior is the drawback here. I need at least
the last workflow to be saved in the archive folder, and the user to be
notified to increase the size. At the same time, until NiFi restarts, at
least the last complete workflow should be backed up.


Another suggestion of mine is as follows:


Regardless of the max.time and max.storage properties, can we keep only a few
files in the archive (say 10 files)? Each action on the NiFi canvas should be
tracked there; when the flow.xml.gz archive file count reaches the limit, the
oldest file should be deleted and the latest file saved, so that the count of
10 is maintained. This way we maintain the workflow history properly, and
backup is achieved without the confusion between max.time and max.storage.
The only remaining case is the disk size being exceeded, about which we
should notify the user.


Many thanks.

On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hi Prabhu,
>
> Thanks for sharing your experience with flow file archiving.
> The case that a single flow.xml.gz file size exceeds
> archive.max.storage was not considered well when I implemented
> NIFI-2145.
>
> By looking at the code, it currently works as follows:
> 1. The original conf/flow.xml.gz (> 1MB) is archived to conf/archive
> 2. NiFi checks if there's any expired archive files, and delete it if any
> 3. NiFi checks if the total size of all archived files, then delete
> the oldest archive. Keep doing this until the total size becomes less
> than or equal to the configured archive.max.storage.
>
> In your case, at step 3, the newly created archive is deleted, because
> its size was grater than archive.max.storage.
> In this case, NiFi only logs INFO level message, and it's hard to know
> what happened from user, as you reported.
>
> I'm going to create a JIRA for this, and fix current behavior by
> either one of following solutions:
>
> A. treat archive.max.storage as a HARD limit. If the original
> flow.xml.gz exceeds configured archive.max.storage in size, then throw
> an IOException, which results a WAR level log message "Unable to
> archive flow configuration as requested due to ...".
>
> B. treat archive.max.storage as a SOFT limit. By not including the
> newly created archive file at the step 2 and 3 above, so that it can
> stay there. Maybe a WAR level log message should be logged.
>
> For greater user experience, I'd prefer solution B, so that it can be
> archived even the flow.xml.gz exceeds archive storage size, since it
> was able to be written to disk, which means the physical disk had
> enough space.
>
> How do you think?
>
> Thanks!
> Koji
>
> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
> <prabhuu161...@gmail.com> wrote:
> > i have check below properties used for the backup operations in
> Nifi-1.0.0
> > with respect to JIRA.
> >
> > https://issues.apache.org/jira/browse/NIFI-2145
> >
> > nifi.flow.configuration.archive.max.time=1 hours
> > nifi.flow.configuration.archive.max.storage=1 MB
> >
> > Since we have two backup operations first one is "conf/flow.xml.gz" and
> > "conf/archive/flow.xml.gz"
> >
> > I have saved archive workflows(conf/archive/flow.xml.gz) as per hours in
> > "max.time" property.
> >
> > At particular time i have reached "1 MB"[set as size of default storage].
> >
> > So it will delete existing conf/archive/flow.xml.gz completely and
> doesn't
> > write new flow files in conf/archive/flow.xml.gz due to size exceeds.
> >
> > No logs has shows that new flow.xml.gz has higher size than specified
> > storage.
> >
> > Can we able to
> >
> > Why it could delete existing flows and doesn't write new flows due to
> > storage?
> >
> > In this case in one backup operation has failed or not?
> >
> > Thanks,
> >
> > prabhu
>


How backup works when flow.xml size more than max storage?

2017-01-17 Thread prabhu Mahendran
I have checked the properties below, used for the backup operations in NiFi
1.0.0, with respect to this JIRA:

https://issues.apache.org/jira/browse/NIFI-2145

*nifi.flow.configuration.archive.max.time=1 hours*

*nifi.flow.configuration.archive.max.storage=1 MB*
We have two backup locations: the first is *"conf/flow.xml.gz"* and the
second is *"conf/archive/flow.xml.gz"*.

Archived workflows (conf/archive/flow.xml.gz) are saved hourly, according to
the "max.time" property.

At a particular time I reached "1 MB" [set as the default max storage size].

At that point it deletes the existing *conf/archive/flow.xml.gz* completely
and does not write new flow files into *conf/archive/flow.xml.gz*, because
the size is exceeded.

No log shows that the new flow.xml.gz is larger than the specified storage.

*Why does it delete the existing flows and stop writing new flows because of
the storage limit?*

In this case, has one of the two backup operations failed or not?

Thanks,

prabhu


Re: Delimiter splitting in ExtractText possible?

2016-11-24 Thread prabhu Mahendran
Hi folks,

@Jason --> Thank you so much for your suggestions; they are really helpful for us.

@Joe -->
I have CSV data with ',' as the separator, and I want to move it into SQL
Server.

I just need the quickest way to extract all of this unstructured data, either
with a common regex or by using the delimiter of the CSV file.

So I thought that giving ',' as the delimiter would split the data into
data.1, data.2, ... up to the number of columns in the file, using the comma
in the ExtractText processor.

But Jason gave a common regex ([^,]*?),([^,]*),  to split the data. It
could be useful for me; however, this regex is very expensive for pattern
matching.

If I use that regex, it sometimes shows a "Java heap space" error in the
ReplaceText and UpdateAttribute processors.

*Is there any way to split the data by using a separator such as , or |?*

Every file has some delimiter; if I could give the delimiter to the
processor, it would extract each row into data.1, data.2, ... etc.

So I gave ',' as the value of a new user-defined attribute in ExtractText,
but it shows a validation error.

Is there any other way to extract *CSV* data by using the separator of the
file?







On Wed, Nov 23, 2016 at 8:57 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Jason
>
> That was an excellent response.
>
> Prabhu - i think the question is what would you like to do with the
> data?  Are you going to transform it then send it somewhere?  Do you
> want to be able to filter some rows out then send the rest?  Can you
> describe that part more?
>
> The general pattern here is
>
> It is certainly easy enough to do the two-phase split to maintain
> efficiency
>
> SplitText (500 line chunks for example)
> SplitText (single line chunks)
> ?? - what do you want to accomplish per line?
> ?? - where is the data going?
>
> Thanks
> Joe
>
> On Wed, Nov 23, 2016 at 9:41 AM, Jason Tarasovic
> <jasontaraso...@mobilgov.com> wrote:
> > Prabhu,
> >
> > It's possible to do what you're asking but not especially efficient. You
> can
> > SplitText twice (10,000 and then 1) outputting the header on each and
> then
> > running the result through ExtractText. Your regex would be something
> like
> > ([^,]*?),([^,]*), so match 0 or more non-comma characters followed
> by a
> > comma. ExtractText will place the matched capture groups into attributes
> > like you mentioned (date.1->the_captured_text)
> >
> > However, it's not super efficient or at least it hasn't been in my case
> as
> > you're moving the FlowFile contents into attributes and the attributes
> are
> > stored in memory so, depending on how large the file is, you *may*
> > experience excessive GC activity or OOM errors.
> >
> > Using InferAvroSchema (if you don't know the schema in advance) and then
> > using ConvertCSVtoAvro may be better option depending on where the data
> is
> > ultimately going. One caveat though is that ConvertCSVtoAvro seems to
> only
> > work with properly quoted and escaped CSV that conforms to RFC 4180.
> >
> > I'm just getting started with NiFi myself so not an expert or anything
> but I
> > hope that helps.
> >
> > -Jason
> >
> > On Tue, Nov 22, 2016 at 3:34 AM, prabhu Mahendran <
> prabhuu161...@gmail.com>
> > wrote:
> >>
> >> Hi All,
> >>
> >> I have CSV unstructured data with comma as delimiter which contains 100
> >> rows.
> >>
> >> Is it possible to extract the data's in csv file using comma as
> seperator
> >> in nifi processors.
> >>
> >>
> >> See my Sample data 3 from 100 rows.
> >>
> >> No,Name,Age,PAN,City
> >> 1,Siva,22,91230,Londan,
> >> 2,,23,91231,UK
> >> 3,Greck,22,,US
> >>
> >>
> >> In 1st row having all values which can be seperated by "data" attribute
> >> having regex (.+),(.+),(.+),(.+),(.+) then row will be split like
> below..,
> >>
> >> data.1-->1
> >> data.2-->Siva
> >> data.3-->22
> >> data.4-->91230
> >> data.5-->Londan
> >>
> >> But in Second row which having Empty values in Name column can using
> regex
> >> (.+),,(.+),(.+),(.+) then row will be split like below..,
> >>
> >>data.1-->2
> >>data.2-->23
> >>data.3-->91231
> >>data.4-->UK
> >>
> >> Third row same as PAN Column empty it can able to split using another
> >> regex attribute.
> >>
> >> But my problem is now data having 100 rows.In future this may having
> >> another 100 rows.So again need to write more regex attributes to capture
> >> group wise .
> >>
> >>
> >> So I think  i have given comma(,) as common regex for all rows in csv
> file
> >> then it will split data as into data.1,data.2,...data.5
> >>
> >> But i gets an validation failed error in Bulletins Indicator in
> >> ExtractTextProcessor.
> >>
> >> So is this possible to write delimiter wise splitting of rows in CSV
> File?
> >>
> >> Is this possible to write common regex for all csv data in ExtractText
> or
> >> any other processor?
> >>
> >
>


Delimiter splitting in ExtractText possible?

2016-11-22 Thread prabhu Mahendran
Hi All,

I have unstructured CSV data with a comma as the delimiter, containing 100
rows.

Is it possible to extract the data in the CSV file using the comma as a
separator with NiFi processors?


*See my sample data: 3 of the 100 rows.*

*No,Name,Age,PAN,City*
*1,Siva,22,91230,Londan,*
*2,,23,91231,UK*

*3,Greck,22,,US*

The 1st row has all of its values, so with a "data" attribute whose regex is
*(.+),(.+),(.+),(.+),(.+)*, the row is split like below:

data.1-->1
data.2-->Siva
data.3-->22
data.4-->91230
data.5-->Londan

But the second row, which has an empty value in the Name column, needs the
regex (.+),,(.+),(.+),(.+), and then the row is split like below:

   data.1-->2
   data.2-->23
   data.3-->91231
   data.4-->UK

The third row is similar: the PAN column is empty, so it can only be split
using yet another regex attribute.

But my problem is that the data now has 100 rows, and in future it may have
another 100 rows, so again I would need to write more regex attributes to
capture the groups.


*So I thought that giving the comma (,) as a common regex for all rows in the
CSV file would split the data into data.1, data.2, ... data.5.*

*But I get a validation failed error in the bulletins indicator of the
ExtractText processor.*

*So is it possible to split rows of a CSV file by their delimiter? Is it
possible to write one common regex for all CSV data in ExtractText or any
other processor?*
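One possible single ExtractText property that covers rows with empty fields as
well, since [^,]* (unlike .+) also matches an empty field; the attribute name
csvRow is only an example:

    csvRow = ([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)

This captures the first five comma-separated fields of every line into
csvRow.1 ... csvRow.5, whether or not some of them are empty, so one
expression serves all rows.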


Re: How to increase the processing speed of the ExtractText and ReplaceText Processor?

2016-10-20 Thread prabhu Mahendran
Lee,

I have tried your suggested flow; it is able to insert the data into SQL
Server in 50 minutes, which is still a long time.

==> Your query: *You might be processing the entire dat file (instead of a
single row) for each record.*
  How can I process the entire dat file into SQL Server?

==> Your query: *Without any new optimizations you'll need ~25 threads and
sufficient memory to feed the threads.*
  My processors run with only 10 threads when I set concurrent tasks. How do
I increase that to 25 threads?

If you tried a quick test, please share the regex that you used.

Is there any other processor with functionality like ExtractText?

Thanks



On Wed, Oct 19, 2016 at 11:29 PM, Lee Laim <lee.l...@gmail.com> wrote:

> Prabu,
>
> In order to move 3M rows in 10 minutes, you'll need to process 5000
> rows/second.
> During your 4 hour run, you were processing ~200 rows/second.
>
> Without any new optimizations you'll need ~25 threads and sufficient
> memory to feed the threads.  I agree with Mark and you should be able to
> get far more than 200 rows/second.
>
> I ran a quick test using your ExtractText regex on similar data I was able
> to process over 100,000 rows/minute through the extract text processor.
> The input data was a single row of 4 fields delimited by the "|" symbol.
>
> *You might be processing the entire dat file (instead of a single row) for
> each record.*
> *Can you check the flow file attributes and content going into
> ExtractText?  *
>
>
> Here is the flow with some notes:
>
> 1.GetFile (a 30 MB .dat file consisting of 3M rows; each row is about 10
> bytes)
>
> 2 SplitText -> SplitText  (to break the 3M rows down to manageable chunks
> of 10,000 lines per flow file, then split again to 1 line per flow file)
>
> 3. ExtractText to extract the 4 fields
>
> 4. ReplaceText to generate json (You can alternatively use
> AttributesToJson here)
>
> 5. ConvertJSONtoSQL
>
> 6. PutSQL - (This should be true bottleneck; Index the DB well and use
> many threads)
>
>
> If my assumptions are incorrect, please let me know.
>
> Thanks,
> Lee
>
> On Thu, Oct 20, 2016 at 1:43 AM, Kevin Verhoeven <
> kevin.verhoe...@ds-iq.com> wrote:
>
>> I’m not clear on how much data you are processing, does the data(.dat)
>> file have 3,00,000 rows?
>>
>>
>>
>> Kevin
>>
>>
>>
>> *From:* prabhu Mahendran [mailto:prabhuu161...@gmail.com]
>> *Sent:* Wednesday, October 19, 2016 2:05 AM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: How to increase the processing speed of the ExtractText
>> and ReplaceText Processor?
>>
>>
>>
>> Mark,
>>
>> Thanks for the response.
>>
>> My Sample input data(.dat) like below..,
>>
>> 1|2|3|4
>> 6|7|8|9
>> 11|12|13|14
>>
>> In Extract Text,i have add input row only with addition of default
>> properties like below screenshot.
>>
>> [image: Inline image 1]
>> In Replace text ,
>>
>> just replace value like {"data1":"${inputrow.1}","data
>> 2":"${inputrow.2}","data3":"${inputrow.3}","data4":"${inputrow.4}"}
>> [image: Inline image 2]
>>
>>
>> Here there is no bulletins indicates back pressure on processors.
>>
>> Can i know prerequisites needed for move the 3,00,000 data into sql
>> server in duration 10-20 minutes?
>> What are the number of CPU' s needed?
>> How much heap size and perm gen size we need to set for move that data
>> into sql server?
>>
>> Thanks
>>
>>
>>
>> On Tue, Oct 18, 2016 at 7:05 PM, Mark Payne <marka...@hotmail.com> wrote:
>>
>> Prabhu,
>>
>>
>>
>> Thanks for the details. All of this seems fairly normal. Given that you
>> have only a single core,
>>
>> I don't think multiple concurrent tasks will help you. Can you share your
>> configuration for ExtractText
>>
>> and ReplaceText? Depending on the regex'es being used, they can be
>> extremely expensive to evaluate.
>>
>> The regex that you mentioned in the other email -
>> "(.+)[|](.+)[|](.+)[|](.+)" is in fact extremely expensive.
>>
>> Any time that you have ".*" or ".+" in your regex, it is going to be
>> extremely expensive, especially with
>>
>> longer FlowFile content.
>>
>>
>>
>> Also, do you see any bulletins indicating that the provenance repository
>> is applying backpressure? Given
>>
>> that you are splitting your 

How to increase the processing speed of the ExtractText and ReplaceText Processor?

2016-10-17 Thread prabhu Mahendran
Hi All,

I have tried to perform the below operation.

dat file(input)-->JSON-->SQL-->SQLServer


GetFile-->SplitText-->SplitText-->ExtractText-->ReplaceText-->ConvertJsonToSQL-->PutSQL.

My Input File(.dat)-->3,00,000 rows.

*Objective:* Move the data from '.dat' file into SQLServer.

I am able to store the data in SQL Server by using the combination of
processors above, but it takes almost 4-5 hours to move the complete data
into SQL Server.

The combination of SplitText processors reads the data quickly, but
ExtractText takes a long time to match the data against the user-defined
expression. Even though the input is 107 MB, it sends output only in KB-sized
flowfiles, and the ReplaceText processor also processes data only in KB
sizes.

This slow processing is what makes the data take so long to reach SQL Server.


The ExtractText, ReplaceText and ConvertJSONToSQL processors send their
outgoing flowfiles only in kilobytes.

If I set concurrent tasks on ExtractText, ReplaceText and ConvertJSONToSQL,
they occupy 100% of the CPU and disk usage.

It is just 30 MB of data, but the processors take 6 hours to move it into SQL
Server.

The problems I face are:


   1. Almost 6 hours are taken to move the 3 lakh rows into SQL Server.
   2. ExtractText and ReplaceText take a long time to process the data (they
   send output flowfiles only in KB sizes).

Can anyone help me with the requirement below?

I need to reduce the time taken by the processors to move the lakhs of rows
into SQL Server.



If I have done anything wrong, please help me do it right.
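Following Mark's and Lee's advice in the thread above about avoiding (.+)
groups, a cheaper pair of settings for this flow might look like the
following; the property name 'inputrow' and the exact values are assumptions:

    ExtractText (user-defined property):
        inputrow = ([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)

    ReplaceText (Replacement Value):
        {"data1":"${inputrow.1}","data2":"${inputrow.2}","data3":"${inputrow.3}","data4":"${inputrow.4}"}

The character-class groups cannot run past a '|' delimiter, so matching each
one-line flowfile stays cheap compared to the (.+)[|](.+) version.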


Regarding ConsumeIMAP Processor.

2016-09-20 Thread prabhu Mahendran
Hi,

I am new to NiFi. I have just used the ConsumeIMAP processor to retrieve
attachments from a mail server.

When I use it, I am able to download the attachment, but the document
contains MIME type information in addition to the email data, as in the
screenshot below.


I need to extract only the exact data, but it comes with some MIME
information.

Can anyone please help me extract only the data, or remove the MIME
information from the file?

Thanks,


Re: Large dataset on hbase

2016-04-13 Thread prabhu Mahendran
Hi,

1.Is the output of your Pig script a single file that contains all the JSON
documents corresponding to your CSV?

Yes, the output of my Pig script has all the JSON documents corresponding to
the CSV.

2.Also, are there any errors in logs/nifi-app.log (or on the processor in
the UI) when this happens?

There are no errors in either the web interface (UI) or the logs/nifi-app.log
file.


Thanks,

Prabhu Mahendran


On 12-Apr-2016 8:20 pm, "Bryan Bende" <bbe...@gmail.com> wrote:

>
> Is the output of your Pig script a single file that contains all the JSON
> documents corresponding to your CSV?
> or does it create a single JSON document for each row of the CSV?
>
> Also, are there any errors in logs/nifi-app.log (or on the processor in
> the UI) when this happens?
>
> -Bryan
>
> On Tue, Apr 12, 2016 at 12:38 PM, prabhu Mahendran <
> prabhuu161...@gmail.com> wrote:
>
>> Hi,
>>
>> I just use Pig Script to convert the CSV into JSON with help of
>> ExecuteProcess.
>>
>> In my case i have use n1 from JSON document which could be stored as row
>> key in HBase Table.So n2-n22 store as columns in hbase.
>>
>> some of rows (n1's) are stored inside the table but remaining are read
>> well but not stored.
>>
>> Thanks,
>> Prabhu Mahendran
>>
>> On Tue, Apr 12, 2016 at 1:58 PM, Bryan Bende <bbe...@gmail.com> wrote:
>>
>>> Hi Prabhu,
>>>
>>> How did you end up converting your CSV into JSON?
>>>
>>> PutHBaseJSON creates a single row from a JSON document. In your example
>>> above, using n1 as the rowId, it would create a row with columns n2 - n22.
>>> Are you seeing columns missing, or are you missing whole rows from your
>>> original CSV?
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>>
>>>
>>> On Mon, Apr 11, 2016 at 11:43 AM, prabhu Mahendran <
>>> prabhuu161...@gmail.com> wrote:
>>>
>>>> Hi Simon/Joe,
>>>>
>>>> Thanks for this support.
>>>> I have successfully converted the CSV data into JSON and also insert
>>>> those JSON data into Hbase Table using PutHBaseJSon.
>>>> Part of JSON Sample Data like below:
>>>>
>>>> {
>>>> "n1":"",
>>>> "n2":"",
>>>> "n3":"",
>>>> "n4":"","n5":"","n6":"",
>>>> "n7":"",
>>>> "n8":"",
>>>> "n9":"",
>>>>
>>>> "n10":"","n11":"","n12":"","n13":"","n14":"","n15":"","n16":"",
>>>>
>>>> "n17":"","n18":"","n19":"","n20":"","n21":"-",
>>>> "n22":""
>>>>
>>>> }
>>>> PutHBaseJSON:
>>>>Table Name is 'Hike' , Column Family:'Sweet' ,Row
>>>> Identifier Field Name:n1(Element in JSON File).
>>>>
>>>> My Record Contains 15 lacks rows but HBaseTable contains only 10 rows.
>>>> It Can Read the 15 lacks rows but stores minimum rows.
>>>>
>>>> Anyone please help me to solve this?
>>>>
>>>>
>>>>
>>>>
>>>> Prabhu,
>>>>
>>>> If the dataset being processed can be split up and still retain the
>>>> necessary meaning when input to HBase I'd recommend doing that.  NiFI
>>>> itself, as a framework, can handle very large objects because its API
>>>> doesn't force loading of entire objects into memory.  However, various
>>>> processors may do that and I believe ReplaceText may be one that does.
>>>> You can use SplitText or ExecuteScript or other processors to do that
>>>> splitting if that will help your case.
>>>>
>>>> Thanks
>>>> Joe
>>>>
>>>> On Sat, Apr 9, 2016 at 6:35 PM, Simon Ball <sb...@hortonworks.com>
>>>> wrote:
>>>> > Hi Prabhu,
>>>> >
>>>> > Did you try increasing the heap size in conf/bootstrap.conf? By
>>>> default nifi
>>>> > uses a very small RAM allocation (512MB). You can increase this by
>>>> tweaking
>>>> > java.arg.2 and .3 in the bootstrap.conf file. Note that this is the
>>>> java
>>>> > heap, so you will need more than your data size to account for java
>>>> object
>>>> > overhead. The other thing to check is the buffer sizes you are using
>>>> for
>>>> > your replace text processors. If you’re also using Split processors,
>>>> you can
>>>> > sometime run up against RAM and open file limits, if this is the
>>>> case, make
>>>> > sure you increase the ulimit -n settings.
>>>> >
>>>> > Simon
>>>> >
>>>> > On 9 Apr 2016, at 16:51, prabhu Mahendran <prabhuu161...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > I am new to nifi and does not know how to process large data like one
>>>> gb csv
>>>> > data into hbase.while try combination of getFile and putHbase shell
>>>> leads
>>>> > Java Out of memory error and also try combination of replace text,
>>>> extract
>>>> > text and puthbasejson doesn't work on large dataset but it work
>>>> correctly in
>>>> > smaller dataset.
>>>> > Can anyone please help me to solve this?
>>>> > Thanks in advance.
>>>> >
>>>> > Thanks & Regards,
>>>> > Prabhu Mahendran
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>
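A small worked example of the row mapping Bryan describes in this thread (the
values are invented for illustration): with Row Identifier Field Name = n1,
Table Name = Hike and Column Family = Sweet, the document

    {"n1":"r1","n2":"a","n3":"b"}

becomes HBase row key 'r1' with columns Sweet:n2=a and Sweet:n3=b. A second
document that also carries "n1":"r1" writes into the same row key instead of
creating a new row, so the table ends up with one row per distinct n1 value
rather than one row per JSON document.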


Large dataset on hbase

2016-04-09 Thread prabhu Mahendran
Hi,

I am new to NiFi and do not know how to process large data, such as a 1 GB
CSV, into HBase. Trying the combination of GetFile and PutHBase shell leads
to a Java out-of-memory error, and the combination of ReplaceText,
ExtractText and PutHBaseJSON does not work on the large dataset, although it
works correctly on a smaller dataset.
Can anyone please help me solve this?
Thanks in advance.

Thanks & Regards,
Prabhu Mahendran


Re: Sqoop Support in NIFI

2016-04-01 Thread prabhu Mahendran
Is there any way to store the exact data in HDFS from databases (Oracle,
MySQL, SQL Server) without converting the data into Avro or JSON?

On Wed, Mar 30, 2016 at 2:51 PM, Simon Ball <sb...@hortonworks.com> wrote:

> Are you planning to use something like Hive or Spark to query the data?
> Both will work fine with Avro formatted data under a table. I’m not sure
> what you mean by “Table Structure” or if you have a particular format in
> mind, but there is I believe talk of adding processors that will write
> direct to ORC format so convert the Avro data to ORC within NiFi.
>
> Simon
>
> On 30 Mar 2016, at 07:06, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> For Below reasons i have choose Sqoop in NIFI Processor is the best method
> to move data in Table Structure.
>
> If once move the Table from oracle or sql server into HDFS then whole
> moved data which must be in Table format not in avro or
> json..etc.
>
> For Example:Table Data from Oracle which is in form of Table Structure
> and using Execute SQL to move those data into HDFS  which is in
> avro or json format.but i need that data in Table Structure.
>
> And I have try QueryDatabaseTable Processor in nifi-0.6.0 It can return
> the Table record in avro format but i need those data in Table Structure.
>
> So anyone please help me to solve this.
>
>
>
>
>
> On Tue, Mar 29, 2016 at 3:02 PM, Simon Ball <sb...@hortonworks.com> wrote:
>
>> Another processor that may be of interest to you is the
>> QueryDatabaseTable processor, which has just been released in 0.6.0. This
>> provides incremental load capabilities similar to sqoop.
>>
>> If you’re looking for the schema type functionality, bear in mind that
>> the ExecuteSQL (and new Query processor) preserve schema with Avro.
>>
>> Sqoop also allows import to HBase, which you can do with PutHBaseJson
>> (use the ConvertAvroToJson processor to feed this).
>>
>> Distributed partitoned queries isn’t in there yet, but I believe is on
>> the way, so sqoop may have the edge for that use case today.
>>
>> Granted, NiFi doesn’t have much by way of HCatalog integration at the
>> moment, but most of the functionality you’ll find in Sqoop is in NiFi.
>> Unless you are looking to move terabytes at a time, then NiFi should be
>> able to handle most of what you would use sqoop for, so it would be very
>> interesting to hear more detail on your use case, and why you needed sqoop
>> on top of NiFi.
>>
>> Simon
>>
>>
>> On 29 Mar 2016, at 09:06, prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> Yes, In my case i have created the Custom processor with Sqoop API which
>> accommodates complete functionality of sqoop.
>> As per you concern we have able to move the data only from HDFS to SQl or
>> Vice versa, But sqoop having more functionality which we can achieve it by
>> Sqoop.RunTool() in org.apache.sqoop.sqoop. The Sqoop Java client will works
>> well and Implement that API into new Sqoop NIFI processor Doesn't work!
>>
>> On Tue, Mar 29, 2016 at 12:49 PM, Conrad Crampton <
>> conrad.cramp...@secdata.com> wrote:
>>
>>> Hi,
>>> If you could explain exactly what you are trying to achieve I.e. What
>>> part of the data pipeline you are looking to use NiFi for and where you
>>> wish to retain Sqoop I could perhaps have a more informed input (although I
>>> have only been using NiFi myself for a few weeks). Sqoop obviously can move
>>> the data from RDBM systems through to HDFS (and vice versa) as can NiFi,
>>> not sure why you would want the mix (or at least I can’t see it from the
>>> description you have provided thus far).
>>> I have limited knowledge of Sqoop, but either way, I am sure you could
>>> ‘drive’ Sqoop from a custom NiFi processor if you so choose, and you can
>>> ‘drive’ NiFi externally (using the REST api) - if Sqoop can consume it.
>>> Regards
>>> Conrad
>>>
>>>
>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>> Date: Tuesday, 29 March 2016 at 07:55
>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>> Subject: Re: Sqoop Support in NIFI
>>>
>>> Hi Conrad,
>>>
>>> Thanks for Quick Response.
>>>
>>> Yeah.Combination of Execute SQL and Put HDFS works well instead
>>> of Sqoop.But is there any possible to use Sqoop(client) to do like this?
>>

Re: Sqoop Support in NIFI

2016-03-30 Thread prabhu Mahendran
For the reasons below, I chose Sqoop in a NiFi processor as the best method
to move data in its table structure.

Once a table is moved from Oracle or SQL Server into HDFS, the moved data
must stay in its table format, not in Avro or JSON, etc.

For example: table data from Oracle is in a table structure, and using
ExecuteSQL to move that data into HDFS produces Avro or JSON format, but I
need that data in its table structure.

I have also tried the QueryDatabaseTable processor in NiFi 0.6.0; it returns
the table records in Avro format, but I need the data in its table structure.

So can anyone please help me solve this?





On Tue, Mar 29, 2016 at 3:02 PM, Simon Ball <sb...@hortonworks.com> wrote:

> Another processor that may be of interest to you is the QueryDatabaseTable
> processor, which has just been released in 0.6.0. This provides incremental
> load capabilities similar to sqoop.
>
> If you’re looking for the schema type functionality, bear in mind that the
> ExecuteSQL (and new Query processor) preserve schema with Avro.
>
> Sqoop also allows import to HBase, which you can do with PutHBaseJson (use
> the ConvertAvroToJson processor to feed this).
>
> Distributed partitoned queries isn’t in there yet, but I believe is on the
> way, so sqoop may have the edge for that use case today.
>
> Granted, NiFi doesn’t have much by way of HCatalog integration at the
> moment, but most of the functionality you’ll find in Sqoop is in NiFi.
> Unless you are looking to move terabytes at a time, then NiFi should be
> able to handle most of what you would use sqoop for, so it would be very
> interesting to hear more detail on your use case, and why you needed sqoop
> on top of NiFi.
>
> Simon
>
>
> On 29 Mar 2016, at 09:06, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
>
> Hi,
>
> Yes, In my case i have created the Custom processor with Sqoop API which
> accommodates complete functionality of sqoop.
> As per you concern we have able to move the data only from HDFS to SQl or
> Vice versa, But sqoop having more functionality which we can achieve it by
> Sqoop.RunTool() in org.apache.sqoop.sqoop. The Sqoop Java client will works
> well and Implement that API into new Sqoop NIFI processor Doesn't work!
>
> On Tue, Mar 29, 2016 at 12:49 PM, Conrad Crampton <
> conrad.cramp...@secdata.com> wrote:
>
>> Hi,
>> If you could explain exactly what you are trying to achieve I.e. What
>> part of the data pipeline you are looking to use NiFi for and where you
>> wish to retain Sqoop I could perhaps have a more informed input (although I
>> have only been using NiFi myself for a few weeks). Sqoop obviously can move
>> the data from RDBM systems through to HDFS (and vice versa) as can NiFi,
>> not sure why you would want the mix (or at least I can’t see it from the
>> description you have provided thus far).
>> I have limited knowledge of Sqoop, but either way, I am sure you could
>> ‘drive’ Sqoop from a custom NiFi processor if you so choose, and you can
>> ‘drive’ NiFi externally (using the REST api) - if Sqoop can consume it.
>> Regards
>> Conrad
>>
>>
>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Tuesday, 29 March 2016 at 07:55
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: Sqoop Support in NIFI
>>
>> Hi Conrad,
>>
>> Thanks for Quick Response.
>>
>> Yeah.Combination of Execute SQL and Put HDFS works well instead
>> of Sqoop.But is there any possible to use Sqoop(client) to do like this?
>>
>> Prabhu Mahendran
>>
>> On Tue, Mar 29, 2016 at 12:04 PM, Conrad Crampton <
>> conrad.cramp...@secdata.com> wrote:
>>
>>> Hi,
>>> Why use sqoop at all? Use a combination of ExecuteSQL [1] and PutHDFS
>>> [2].
>>> I have just replace the use of Flume using a combination of ListenSyslog
>>> and PutHDFS which I guess is a similar architectural pattern.
>>> HTH
>>> Conrad
>>>
>>>
>>>
>>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>>  [1]
>>>
>>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
>>>  [2]
>>>
>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>> Date: Tuesday, 29 March 2016 at 07:27
>>> To: "users@nifi.apache.org" <users@nifi.apache.or

Re: Sqoop Support in NIFI

2016-03-29 Thread prabhu Mahendran
Hi,

Yes, in my case I created a custom processor with the Sqoop API, which
accommodates the complete functionality of Sqoop.
As per your concern, we are only able to move the data from HDFS to SQL or
vice versa, but Sqoop has more functionality, which we can reach through
Sqoop.runTool() in org.apache.sqoop.Sqoop. The Sqoop Java client works well
on its own, but implementing that API inside a new Sqoop NiFi processor does
not work!

On Tue, Mar 29, 2016 at 12:49 PM, Conrad Crampton <
conrad.cramp...@secdata.com> wrote:

> Hi,
> If you could explain exactly what you are trying to achieve I.e. What part
> of the data pipeline you are looking to use NiFi for and where you wish to
> retain Sqoop I could perhaps have a more informed input (although I have
> only been using NiFi myself for a few weeks). Sqoop obviously can move the
> data from RDBM systems through to HDFS (and vice versa) as can NiFi, not
> sure why you would want the mix (or at least I can’t see it from the
> description you have provided thus far).
> I have limited knowledge of Sqoop, but either way, I am sure you could
> ‘drive’ Sqoop from a custom NiFi processor if you so choose, and you can
> ‘drive’ NiFi externally (using the REST api) - if Sqoop can consume it.
> Regards
> Conrad
>
>
> From: prabhu Mahendran <prabhuu161...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Tuesday, 29 March 2016 at 07:55
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Sqoop Support in NIFI
>
> Hi Conrad,
>
> Thanks for Quick Response.
>
> Yeah.Combination of Execute SQL and Put HDFS works well instead
> of Sqoop.But is there any possible to use Sqoop(client) to do like this?
>
> Prabhu Mahendran
>
> On Tue, Mar 29, 2016 at 12:04 PM, Conrad Crampton <
> conrad.cramp...@secdata.com> wrote:
>
>> Hi,
>> Why use sqoop at all? Use a combination of ExecuteSQL [1] and PutHDFS [2].
>> I have just replace the use of Flume using a combination of ListenSyslog
>> and PutHDFS which I guess is a similar architectural pattern.
>> HTH
>> Conrad
>>
>>
>>
>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>  [1]
>>
>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
>>  [2]
>>
>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Tuesday, 29 March 2016 at 07:27
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Sqoop Support in NIFI
>>
>> Hi,
>>
>> I am new to nifi.
>>
>>I have to know that  "Is there is any Support for Sqoop with help
>> of NIFI Processors?."
>>
>> And in which way to done the following case with help of Sqoop.
>>
>> Move data from oracle,SqlServer,MySql into HDFS and vice versa.
>>
>>
>> Thanks,
>> Prabhu Mahendran
>>
>>
>>
>>
>> ***This email originated outside SecureData***
>>
>> Click here <https://www.mailcontrol.com/sr/MZbqvYs5QwJvpeaetUwhCQ==> to
>> report this email as spam.
>>
>>
>> SecureData, combating cyber threats
>>
>> --
>>
>> The information contained in this message or any of its attachments may
>> be privileged and confidential and intended for the exclusive use of the
>> intended recipient. If you are not the intended recipient any disclosure,
>> reproduction, distribution or other dissemination or use of this
>> communications is strictly prohibited. The views expressed in this email
>> are those of the individual and not necessarily of SecureData Europe Ltd.
>> Any prices quoted are only valid if followed up by a formal written quote.
>>
>> SecureData Europe Limited. Registered in England & Wales 04365896.
>> Registered Address: SecureData House, Hermitage Court, Hermitage Lane,
>> Maidstone, Kent, ME16 9NT
>>
>
>


Re: Sqoop Support in NIFI

2016-03-29 Thread prabhu Mahendran
Hi Conrad,

Thanks for Quick Response.

Yeah, the combination of ExecuteSQL and PutHDFS works well instead of Sqoop.
But is it possible to use the Sqoop client to do something like this?

Prabhu Mahendran

On Tue, Mar 29, 2016 at 12:04 PM, Conrad Crampton <
conrad.cramp...@secdata.com> wrote:

> Hi,
> Why use sqoop at all? Use a combination of ExecuteSQL [1] and PutHDFS [2].
> I have just replace the use of Flume using a combination of ListenSyslog
> and PutHDFS which I guess is a similar architectural pattern.
> HTH
> Conrad
>
>
>
> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>  [1]
>
> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
>  [2]
>
> From: prabhu Mahendran <prabhuu161...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Tuesday, 29 March 2016 at 07:27
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Sqoop Support in NIFI
>
> Hi,
>
> I am new to nifi.
>
>I have to know that  "Is there is any Support for Sqoop with help
> of NIFI Processors?."
>
> And in which way to done the following case with help of Sqoop.
>
> Move data from oracle,SqlServer,MySql into HDFS and vice versa.
>
>
> Thanks,
> Prabhu Mahendran
>
>
>
>
> ***This email originated outside SecureData***
>
> Click here <https://www.mailcontrol.com/sr/MZbqvYs5QwJvpeaetUwhCQ==> to
> report this email as spam.
>
>
> SecureData, combating cyber threats
>
> --
>
> The information contained in this message or any of its attachments may be
> privileged and confidential and intended for the exclusive use of the
> intended recipient. If you are not the intended recipient any disclosure,
> reproduction, distribution or other dissemination or use of this
> communications is strictly prohibited. The views expressed in this email
> are those of the individual and not necessarily of SecureData Europe Ltd.
> Any prices quoted are only valid if followed up by a formal written quote.
>
> SecureData Europe Limited. Registered in England & Wales 04365896.
> Registered Address: SecureData House, Hermitage Court, Hermitage Lane,
> Maidstone, Kent, ME16 9NT
>


Sqoop Support in NIFI

2016-03-29 Thread prabhu Mahendran
Hi,

I am new to NiFi.

   I would like to know whether there is any support for Sqoop with the help
of NiFi processors.

And in what way could the following case be done with the help of Sqoop?

Move data from Oracle, SQL Server, MySQL into HDFS and vice versa.


Thanks,
Prabhu Mahendran
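For reference, the ExecuteSQL + PutHDFS combination suggested in the replies
above could be configured roughly like this; the query, paths and service
settings are placeholders, not values from the original thread:

    ExecuteSQL:
        Database Connection Pooling Service = a DBCPConnectionPool holding the
            JDBC URL, driver class and credentials of the source database
        SQL select query = SELECT * FROM customers

    PutHDFS:
        Hadoop Configuration Resources = /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
        Directory = /data/customers

ExecuteSQL writes the result set as Avro, which is what the rest of this
thread discusses working around.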


StoreInKiteDataset Processor for Hive

2016-03-02 Thread prabhu Mahendran
Hi,

I have checked the NiFi 0.5.1 binary and tried to use the Hive support in the
StoreInKiteDataset processor, but it shows the error below.

java.lang.IllegalArgumentException: Missing hive metastore connection URI


My target dataset URI: dataset:hive:default/customers2


Can anyone help me solve the above issue? Is there any Hive dependency
required for storing it?

Best,
Prabhu Mahendran
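One thing worth trying, based on the Kite SDK's Hive dataset URI scheme (an
assumption about the fix, not something confirmed in this thread): give the
metastore host and port explicitly in the URI, for example

    dataset:hive://metastore-host:9083/default/customers2

where metastore-host and 9083 stand in for the actual Hive metastore address;
without them Kite has to find the metastore URI through a hive-site.xml on
the classpath.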