[ 
https://issues.apache.org/jira/browse/NIFI-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617932#comment-14617932
 ] 

ASF GitHub Bot commented on NIFI-751:
-------------------------------------

Github user joewitt commented on the pull request:

    https://github.com/apache/incubator-nifi/pull/70#issuecomment-119416855
  
    Alan,
    
    Thanks for contributing!  We'll work with you to get this promptly merged.  
Here are some findings of an initial review:
    
    * Please run the build with 'mvn clean install -Pcontrib-check'.  You'll 
find there are many formatting issues.  It appears that many of the lines have 
had extraneous newlines and tabs added to them.  Please take a look and tweak 
until the contrib check runs cleanly and you still think the code looks good.
    
    * There do not appear to be any unit tests for the processor itself.  I do 
see a couple of unit tests for the converter class/logic, which is good, but it 
is best to also have a test or two for the processor, particularly for 
something that seems as easily tested as this one is.  You can find lots of 
examples throughout the codebase, including in this kite bundle.
    
    * For the conversion routines for Long, Double, Float, etc., you really 
should consider adding regex checks before calling those methods.  Since the 
Java methods use exception handling for flow control, the performance penalty 
can be extremely severe compared to simply doing a regex check beforehand.  In 
the event that the data is clean you'll be good to go, but when it isn't, the 
impact to the system as a whole can be dramatic.  Given the nature of this sort 
of processor, that is probably something to tackle right away.
    
    * This processor is probably a great candidate to use the 'Advanced 
Documentation' feature.  Users will need this to understand the schema/syntax 
of the conversion configuration, and examples would go a long way for that.  
You can read more about this here:
http://nifi.incubator.apache.org/docs/nifi-docs/html/developer-guide.html#advanced-documentation
There are also some examples in the existing standard processors to consider.
    
    * There are a couple of copy/paste errors in the processor from the 
CSV/Avro converter.  Look for these "Failed to convert {}/{} records from CSV 
to Avro" and "Failed to convert {}/{} records from CSV to Avro"
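
    As an illustration of the regex pre-check suggested above for the numeric
    conversions, here is a minimal sketch.  The class and method names
    (NumericPrecheck, toLongOrNull, toDoubleOrNull) are hypothetical and are
    not taken from the pull request, and the patterns are an assumed,
    deliberately conservative subset of what Long.parseLong and
    Double.parseDouble accept.  The idea is only that input which cannot
    parse is rejected by the pattern, so no exception is thrown for dirty data.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not the processor's actual code.
final class NumericPrecheck {

    // Optional sign plus 1 to 18 ASCII digits. Any such value fits in a long,
    // so Long.parseLong cannot throw. Trade-off: valid 19-digit longs are rejected.
    private static final Pattern LONG_PATTERN =
            Pattern.compile("[+-]?\\d{1,18}");

    // Plain decimal with optional fraction and exponent. Assumed subset:
    // hex floats, "NaN", "Infinity" and type suffixes are intentionally rejected.
    private static final Pattern DOUBLE_PATTERN =
            Pattern.compile("[+-]?(\\d+(\\.\\d*)?|\\.\\d+)([eE][+-]?\\d+)?");

    private NumericPrecheck() {
    }

    // Returns the parsed value, or null when the input does not look like a long.
    static Long toLongOrNull(String s) {
        if (s == null || !LONG_PATTERN.matcher(s).matches()) {
            return null;
        }
        return Long.parseLong(s);
    }

    // Returns the parsed value, or null when the input does not look like a decimal.
    static Double toDoubleOrNull(String s) {
        if (s == null || !DOUBLE_PATTERN.matcher(s).matches()) {
            return null;
        }
        return Double.parseDouble(s);
    }

    public static void main(String[] args) {
        // Clean and dirty inputs side by side; dirty ones yield null without an exception.
        for (String raw : new String[] {"42", "-7", "12a", "1.5", "1e3", "abc"}) {
            System.out.println(raw + " -> long=" + toLongOrNull(raw)
                    + ", double=" + toDoubleOrNull(raw));
        }
    }
}
```

    Returning null here is just one way to signal a failed conversion; a real
    processor would more likely record the failure per record or route the
    flow file accordingly.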
    
    I realize this looks like a lot of stuff but it should be pretty easy to 
address and is a good first step.  If you have any questions on it just let us 
know.
    
    Thanks
    Joe


> Add Processor To Convert Avro Formats
> -------------------------------------
>
>                 Key: NIFI-751
>                 URL: https://issues.apache.org/jira/browse/NIFI-751
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 0.1.0
>            Reporter: Alan Jackoway
>
> When working with data from external sources, such as complex WSDL, I 
> frequently wind up with complex nested data that is difficult to work with 
> even when converted to Avro format. Specifically, I often have two needs:
> * Converting types of data, usually from string to long, double, etc. when 
> APIs give only string data back.
> * Flattening data by taking fields out of nested records and putting them on 
> the top level of the Avro file.
> Unfortunately the Kite JSONToAvro processor only supports exact conversions 
> from JSON to a matching Avro schema and will not do data transformations of 
> this type. Proposed processor to come.
> Discussed this with [~rdblue], so tagging him here as I don't have permission 
> to set a CC for some reason.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
