[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2021-09-02 Thread John Wise (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408889#comment-17408889
 ] 

John Wise commented on NIFI-6241:
-

[~dsargrad] - I'm not sure if you've figured this out in the interim, but the 
JsonRecordSetWriter has a "Schema Write Strategy" property which can be set to 
"Set 'avro.schema' Attribute".  The writer will then write the inferred schema 
into that attribute.  We use that regularly for new or problematic datasets to 
create or debug conversions.
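
As a rough illustration of what can be done once the schema has been captured: the attribute value is an Avro schema in JSON form, so its fields can be listed with a few lines of Python. The schema content below is a hypothetical stand-in (the record name and field types are assumptions, not output from this flow), and `list_fields` is an illustrative helper, not part of NiFi.

```python
import json

# Hypothetical value copied from a flow file's 'avro.schema' attribute.
# The real content depends entirely on the data being converted.
AVRO_SCHEMA_ATTRIBUTE = """
{
  "type": "record",
  "name": "exampleRecord",
  "fields": [
    {"name": "position", "type": ["null", "string"]},
    {"name": "ncsmTrackData", "type": ["null", "string"]}
  ]
}
"""


def list_fields(schema_json):
    """Return (name, type) pairs for the top-level fields of an Avro record schema."""
    schema = json.loads(schema_json)
    return [(field["name"], field["type"]) for field in schema.get("fields", [])]


if __name__ == "__main__":
    for name, field_type in list_fields(AVRO_SCHEMA_ATTRIBUTE):
        print(name, field_type)
```

A listing like this makes it easier to spot fields that were inferred as nullable strings, or not inferred at all, before refining the schema by hand.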

> ConvertRecord Schema Inference fails to infer complete schema, or simply fails
> --
>
> Key: NIFI-6241
> URL: https://issues.apache.org/jira/browse/NIFI-6241
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.9.2
>Reporter: David Sargrad
>Priority: Major
> Attachments: Reproduce_ConvertRecord_Shortcoming.xml, 
> image-2019-04-24-13-38-16-605.png, image-2019-04-24-13-39-36-327.png, 
> image-2019-04-24-13-41-00-704.png, image-2019-04-24-13-41-26-860.png, 
> image-2019-04-24-13-43-28-531.png, image-2019-04-24-13-43-59-706.png, 
> image-2019-04-24-17-03-10-728.png, image-2019-04-25-09-13-52-416.png, 
> image-2019-04-25-09-19-15-406.png, image-2019-04-25-09-30-08-297.png, 
> image-2019-05-20-09-01-02-488.png
>
>
> I've got a simple test flow as depicted below:
>  
>  
> !image-2019-04-24-13-38-16-605.png!
>  
> The input XML is:
> !image-2019-04-24-13-41-26-860.png!
>  
> The output JSON is almost correct, yet it is missing two critical fields 
> (they both show up as "null"). The null fields are position and 
> ncsmTrackData. It is also missing all of the attributes on fltdMessage.
>  
> !image-2019-04-24-13-41-00-704.png!
>  
> The configuration of my ConvertRecord is:
> !image-2019-04-24-13-43-28-531.png!
>  
> My XMLReader configuration is:
> !image-2019-04-24-13-43-59-706.png!
>  
>  Questions:
>  # Why are these two fields null? 
>  # Why are all the fltdMessage attributes being ignored?
> It would seem that this is a bug, or at least a major shortcoming, in the 
> schema inference capability. If there were a way for me to view the inferred 
> schema, then I could use that as a starting point. However, it's not clear 
> from the documentation how to view that schema.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2019-05-20 Thread David Sargrad (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844065#comment-16844065
 ] 

David Sargrad commented on NIFI-6241:
-

Hi Otto.

I'm not sure I follow your point. My XMLReader is configured with a schema 
access strategy of "Infer Schema".






[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2019-05-20 Thread Otto Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844044#comment-16844044
 ] 

Otto Fowler commented on NIFI-6241:
---

I'm confused. In your reproduction case, the generate flow file has an Avro 
schema that doesn't have the fields that you are missing.






[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2019-05-20 Thread David Sargrad (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843946#comment-16843946
 ] 

David Sargrad commented on NIFI-6241:
-

Is it easy to expose the structure of the inferred schema? I was looking to use 
that inferred schema as a starting point for a schema that I further refine by 
hand. I did not figure out how to do that.






[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2019-05-20 Thread David Sargrad (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843943#comment-16843943
 ] 

David Sargrad commented on NIFI-6241:
-

Hi. 

Thank you for this response. I agree that relaxing the requirement for a root 
tag when there is only one record in the flow file is a good idea. I think 
people will often have a structure such as the one in my example.

 

Regarding your second point, I am not sure I fully understand the expected 
behavior as you describe it. I would have expected the inference engine to 
give me an inferred structure for the fltdMessage. Specifically, this would 
include values for the following:

!image-2019-05-20-09-01-02-488.png!

 






[jira] [Commented] (NIFI-6241) ConvertRecord Schema Inference fails to infer complete schema, or simply fails

2019-04-25 Thread Matt Burgess (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826255#comment-16826255
 ] 

Matt Burgess commented on NIFI-6241:


I believe the need for a root tag is because the record-based processors are 
meant to work on flow files containing multiple records. Currently the 
XMLReader expects a root tag even if there is only one record in the flow 
file. Perhaps it is possible to relax this requirement if there is only one 
record.

For the "missing" fields, to me it looks like no fields were inferred because 
there are no fields with explicit values within, only self-closing tags with 
attributes. I think that's expected behavior until we revamp the schema system 
to support formats that have metadata about the fields themselves (XML tag 
attributes, e.g.). What fields/values were you expecting? Perhaps we could add 
a property to extract attributes as fields or something.
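
A small sketch of the distinction described above, using Python's standard xml.etree.ElementTree rather than NiFi's XMLReader: a self-closing tag carries no text content, only attributes. The sample document and its attribute names are hypothetical (the real input is only available as a screenshot); only the element names fltdMessage, position, and ncsmTrackData come from this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the attribute names and values are made up for illustration.
SAMPLE_XML = """
<fltdMessage msgType="example" flightRef="123">
  <position latitude="10.0" longitude="20.0"/>
  <ncsmTrackData speed="400"/>
</fltdMessage>
""".strip()


def describe(element):
    """Summarise an element: its tag, its text content (if any), and its attributes."""
    text = (element.text or "").strip()
    return {
        "tag": element.tag,
        "text": text or None,
        "attributes": dict(element.attrib),
    }


root = ET.fromstring(SAMPLE_XML)

if __name__ == "__main__":
    for child in root:
        # Each child has attributes but no text value of its own.
        print(describe(child))
```

This shows only how the data is laid out in the XML; it does not reproduce how NiFi infers a schema.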



