[
https://issues.apache.org/jira/browse/NIFI-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pierre Villard resolved NIFI-7697.
----------------------------------
Resolution: Feedback Received
Apache NiFi 1.x is no longer maintained, and no new release is planned on the
1.x release line. Marking as resolved as part of a cleanup operation. Please
open a new issue with an updated description if this is still relevant for NiFi
2.x.
> NiFi XMLReader Record Component sometimes ignores empty XML Elements
> --------------------------------------------------------------------
>
> Key: NIFI-7697
> URL: https://issues.apache.org/jira/browse/NIFI-7697
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.4
> Environment: Windows 10
> Reporter: Andrew Chafos
> Priority: Major
> Labels: ControllerService, Processor, Record
>
> I am currently developing a processor for Apache NiFi that depends on
> being configured with an implementation of RecordReaderFactory that produces
> well-formed NiFi Records from the input data.
> The JsonTreeReader component produced accurate results for all of my test
> cases. However, I noticed that, at least with the default configuration, the
> XMLReader component sometimes mishandles data: specifically, empty XML
> elements nested inside XML elements that are represented as Arrays in NiFi
> Records.
> This occurs when I test using the standard ConvertRecord NiFi Processor and
> set the Reader to XMLReader and the Writer to JsonRecordSetWriter.
> The first two test cases work as expected:
> *Test Case 1:*
> Input XML:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Root>
>   <DataArr>SomeData</DataArr>
>   <DataArr>
>     <Field>
>       <NonEmptyField>2</NonEmptyField>
>     </Field>
>   </DataArr>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>   {
>     "DataArr":[
>       "SomeData",
>       "MapRecord[{Field=MapRecord[{NonEmptyField=2}]}]"
>     ]
>   }
> ]
> {code}
> *Test Case 2:*
> Input XML:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Root>
>   <SomeData />
>   <MoreData>2</MoreData>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>   {
>     "SomeData":null,
>     "MoreData":2
>   }
> ]
> {code}
> However, the following does *not* work as expected:
> *Test Case 3:*
> Input XML:
> {code:xml}
> <Root>
>   <DataArr>SomeData</DataArr>
>   <DataArr>
>     <Field>
>       <EmptyField/>
>     </Field>
>   </DataArr>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>   {
>     "DataArr":[
>       "SomeData"
>     ]
>   }
> ]
> {code}
> It is critical for the functioning of my Processor that Field and EmptyField
> appear in this Json output for Test Case 3, and for all other inputs
> analogous to this case.
> I have tried to supply a custom NiFi RecordSchema to the components and
> verified it was being used, but I got the same results.
> Is there a way to configure these controllers such that this empty field is
> not ignored, or is this a bug in the XMLReader component?
> You can get these results by running the ConvertRecord processor in NiFi as
> described above, but you can also run this JUnit test with testXml swapped
> out for the particular test case:
> {code:java}
> import org.apache.nifi.controller.ControllerService;
> import org.apache.nifi.json.JsonRecordSetWriter;
> import org.apache.nifi.processor.Relationship;
> import org.apache.nifi.processors.standard.ConvertRecord;
> import org.apache.nifi.reporting.InitializationException;
> import org.apache.nifi.util.MockFlowFile;
> import org.apache.nifi.util.TestRunner;
> import org.apache.nifi.util.TestRunners;
> import org.apache.nifi.xml.XMLReader;
> import org.junit.Test;
>
> public class TestNiFiMinimal {
>
>     @Test
>     public void testEmptyXMLGetsProcessed() throws InitializationException {
>         ConvertRecord convertRecord = new ConvertRecord();
>         TestRunner testRunner = TestRunners.newTestRunner(convertRecord);
>
>         // Register the XML reader with its default configuration.
>         ControllerService xmlReader = new XMLReader();
>         testRunner.addControllerService("xmlReader", xmlReader);
>         testRunner.enableControllerService(xmlReader);
>         testRunner.setProperty("record-reader", "xmlReader");
>
>         // Register the JSON writer with its default configuration.
>         ControllerService jsonWriter = new JsonRecordSetWriter();
>         testRunner.addControllerService("jsonWriter", jsonWriter);
>         testRunner.enableControllerService(jsonWriter);
>         testRunner.setProperty("record-writer", "jsonWriter");
>
>         // Swap in the XML of the test case under investigation.
>         String testXml = "<?xml version='1.0' encoding='UTF-8'?>"
>                 + "<Root><DataArr>SomeData</DataArr><DataArr><Field><EmptyField/></Field></DataArr></Root>";
>         testRunner.enqueue(testXml);
>         testRunner.run();
>
>         Relationship success = convertRecord.getRelationships().stream()
>                 .filter(relationship -> relationship.getName().equals("success"))
>                 .findAny()
>                 .get();
>         testRunner.assertAllFlowFilesTransferred(success);
>
>         final MockFlowFile original = testRunner.getFlowFilesForRelationship(success).get(0);
>         // Placeholder assertion: compare against the JSON expected for the chosen test case.
>         original.assertContentEquals("");
>     }
> }
> {code}
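As a cross-check that does not involve NiFi at all, the Test Case 3 input can be parsed with the JDK's built-in DOM parser to confirm that Field and EmptyField are present in the XML document itself, so their absence from the JSON output is introduced further downstream. The following is only a minimal sketch under that assumption; the class name EmptyElementCheck and its helper methods are hypothetical and not part of NiFi.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, independent of NiFi: lists the path of every element in an XML string.
class EmptyElementCheck {

    // The Test Case 3 input from the issue description.
    static final String TEST_CASE_3 =
            "<Root><DataArr>SomeData</DataArr><DataArr><Field><EmptyField/></Field></DataArr></Root>";

    // Returns one slash-separated path per element, in document order.
    static List<String> elementPaths(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> paths = new ArrayList<>();
            collect(doc.getDocumentElement(), "", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Depth-first walk that records the element itself, then its element children.
    private static void collect(Element element, String parentPath, List<String> paths) {
        String path = parentPath + "/" + element.getTagName();
        paths.add(path);
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, path, paths);
            }
        }
    }

    public static void main(String[] args) {
        elementPaths(TEST_CASE_3).forEach(System.out::println);
    }
}
```

Running main prints five paths, the last of which is /Root/DataArr/Field/EmptyField. This says nothing about how XMLReader infers or applies a schema; it only shows that the empty element exists in the parsed input.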