Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Mark Payne
Venkat, I actually started work on NIFI-3921 today. It touches a lot of things because there is so much now built upon the record readers and writers. So I just need to be very diligent in my testing to ensure that I don't break anything. Hopefully I will have a PR up later this week. Thanks -

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Venkat Williams
Thanks Mark for the valuable inputs. SplitRecord is the way to handle multiline records, and NIFI-3921 helps us avoid needing a schema when we can use the CSV header row itself as the schema. Is anyone working on the NIFI-3921 issue? If not, I can take it up. Regards, Venkat On Tue, Jun 6, 2017 at 10:06 PM,

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Venkat Williams
Thanks Matt, that sounds like a good intermediate solution for now. Regards, Venkat On Jun 6, 2017 22:16, "Matt Burgess" wrote: > Venkat, > > In the meantime, I have a Groovy script for ExecuteScript [1] that will > read the header and create an Avro schema (stored in the avro.schema > attribute) s

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Matt Burgess
Venkat, In the meantime, I have a Groovy script for ExecuteScript [1] that will read the header and create an Avro schema (stored in the avro.schema attribute) so you can set the access strategy to Use Schema Text. It works like the Use Header Fields strategy in CSVReader, meaning all fields ar
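Matt's script is Groovy and lives in ExecuteScript; as an illustration only, here is a rough Python sketch of the same idea (the function name and the all-nullable-string typing are assumptions, matching the "all fields treated as strings" behavior he describes):

```python
import csv
import io
import json

def header_to_avro_schema(csv_text, record_name="csv_record"):
    """Build a minimal Avro record schema from a CSV header row,
    treating every column as a nullable string (hypothetical helper,
    not the actual Groovy script from the thread)."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return json.dumps({
        "type": "record",
        "name": record_name,
        "fields": [{"name": col, "type": ["null", "string"]} for col in header],
    })

schema = header_to_avro_schema("id,topic,hits\nRahul,scala,120\n")
```

The resulting JSON string is what would be stored in the avro.schema attribute for the Use Schema Text access strategy.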

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Mark Payne
Venkat, If you do need to split the data up, there is now a SplitRecord processor that you can use to accomplish that with the readers and writers. So that won't have problems with CSV fields that span multiple lines. Unfortunately at this time, the writer does require that a schema registry be

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Venkat Williams
Hi Joe and Mark, Thanks a lot for your prompt response. I wasn't able to consider SplitText because CSV record field values can spill onto the next line, with embedded newlines, escaped double-quotes, etc. So I had to rule out any logic related to Split. Another question: is it possible to convert
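The embedded-newline problem Venkat describes is easy to demonstrate: a quoted CSV field containing a newline spans two physical lines, so a line-based split (like SplitText's) sees three "lines" where there are only two logical records. A small Python sketch with made-up sample data:

```python
import csv
import io

# A quoted field containing an embedded newline: the second record
# spans two physical lines, so splitting on '\n' would corrupt it.
data = 'id,comment\n1,"first line\nsecond line"\n'

naive = data.strip().split("\n")               # 3 physical lines
records = list(csv.reader(io.StringIO(data)))  # 2 logical records

print(len(naive), len(records))  # 3 2
```

This is why a record-aware splitter (SplitRecord) is needed rather than a text-line splitter.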

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Joe Witt
Venkat, The only heap issues that could be considered common are if you're doing 'SplitText' and trying to go from files with hundreds of thousands or millions of lines to single-line outputs in a single processor. You can easily overcome that by doing a two-phase split where the first processor cuts i
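The two-phase split Joe describes can be sketched in plain Python (this is not NiFi code, and the chunk size of 10,000 is an arbitrary illustration): the first phase cuts the million-line input into moderately sized chunks, and the second phase splits each chunk down to single lines, so no single step holds a million flowfiles at once.

```python
def chunk(lines, size):
    """Split a list of lines into chunks of at most `size` lines."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

lines = [f"line-{i}" for i in range(1_000_000)]
# Phase 1: cut the million-line input into 10,000-line chunks.
phase1 = chunk(lines, 10_000)
# Phase 2: split each chunk down to single lines.
phase2 = [row for c in phase1 for row in chunk(c, 1)]
```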

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Venkat Williams
Thanks Mark for helping me build a template and test Convert CSV to JSON processing. I want to know: is it possible to emit transformed records as they are to the next processor, rather than waiting for full-file processing and keeping the entire result in a single flowfile? Input: id,topic,hits Rahul,scala,
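The streaming behavior Venkat is asking about can be illustrated with a Python generator that yields one JSON document per CSV record instead of buffering the whole result. The sample input in his message is truncated, so the row values below are made up for illustration:

```python
import csv
import io
import json

def csv_to_json_records(csv_text):
    """Yield one JSON document per CSV record, using the header
    row as field names, instead of buffering the whole output."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield json.dumps(row)

# Values after the header are illustrative; the original sample is truncated.
sample = "id,topic,hits\nRahul,scala,120\nNikita,spark,80\n"
docs = list(csv_to_json_records(sample))
```

In NiFi terms, this is the difference between emitting per-record output downstream and holding the entire converted file in one flowfile.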

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Mark Payne
Hi Venkat, I just published a blog post [1] on running SQL in NiFi. The post walks through creating a CSV Record Reader, running SQL over the data, and then writing the results in JSON. This may be helpful to you. In your case, you may want to just use the ConvertRecord processor, rather than Qu
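Mark's post uses NiFi's record machinery to run SQL over CSV and write JSON. As a rough stand-in for that idea only, here is a sketch using Python's sqlite3 with an in-memory table; the table name `records` and the sample values are assumptions, and note that CSV values arrive as text, hence the CAST:

```python
import csv
import io
import json
import sqlite3

def query_csv(csv_text, sql):
    """Load CSV rows into an in-memory SQLite table named `records`,
    run a SQL query over them, and return the results as JSON rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f"CREATE TABLE records ({cols})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO records VALUES ({placeholders})", data)
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    return [json.dumps(dict(zip(names, row))) for row in cur]

out = query_csv("id,topic,hits\nRahul,scala,120\nNikita,spark,80\n",
                "SELECT topic FROM records WHERE CAST(hits AS INTEGER) > 100")
```

Dropping the SQL step and emitting every row as JSON is the ConvertRecord analogue Mark suggests for this use case.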

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Venkat Williams
Hi Joe Witt, Thanks for your response. I have heard and read about these record readers but haven't quite understood how to use them with some test data or a template. It would be great if you could help me get a working example or flow. I want to know if these implementations support RFC-4180
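One RFC-4180 detail worth illustrating (separate from the embedded-newline case): a double quote inside a quoted field is escaped by doubling it. Python's csv module follows the same convention by default, as this small example with made-up data shows:

```python
import csv
import io

# RFC-4180 escapes a double quote inside a quoted field by doubling it.
line = '1,"She said ""hello"", then left"\n'
row = next(csv.reader(io.StringIO(line)))
print(row[1])  # She said "hello", then left
```

A line-based or naive comma-split parser would mangle this record; a compliant reader recovers the embedded quotes and the comma inside the field.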

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Joe Witt
Venkat, I think you'll want to take a closer look at the Apache NiFi 1.2.0 release support for record readers and record writers. It handles schema-aware parsing/transformation and more for things like CSV, JSON, and Avro, can be easily extended, and supports scripted readers and writers written right