Hi Ram,
Were you able to resolve this issue?

I debugged this a bit to find the root cause. It turns out there is indeed a problem in the XmlParser operator: the clazz variable is marked transient in the Parser superclass, so its value is dropped during serialization and deserialization and is null by the time setup() is called. The unit tests did not catch this because they do not exercise serialization of the application. To fix the issue, the transient modifier should be removed from the clazz field in the Parser class.

Moreover, I think you are right about the DocumentBuilder class, though I am afraid I cannot remember the reason for adding parsedOutput as an output port. Could you use the output port 'out' in the superclass as before?

To give a little background on moving this operator from malhar-contrib to the library: originally, while adding XSD schema validation, I changed the dependency from XStream to JAXB. Since no additional dependencies were needed for the XML parser anymore, I moved the class to malhar-library. This introduced some of the issues you saw. In retrospect, I wonder whether it makes sense to revert this class to its 3.2 version, if the XStream usage was more straightforward.

Thanks,
Isha

On Mon, May 9, 2016 at 10:05 AM, Munagala Ramanath <[email protected]> wrote:

> Looks like *XmlParser* operator in 3.3 is broken in a couple of ways:
>
> 1. It uses *DocumentBuilder* and related classes but supplies the XML input
>    string to *DocumentBuilder.parse()*. But that method takes a File,
>    InputSource or URI, _not_ an XML string:
>
>    https://docs.oracle.com/javase/7/docs/api/javax/xml/parsers/DocumentBuilder.html
> 2. It overrides *setup()* and within it invokes
>    *JAXBContext.newInstance(getClazz());* which fails if the *clazz* field is
>    null; this was not the case with the 3.2 version -- still trying to figure
>    out why *clazz* is null even after I explicitly set it to a non-null value
>    in *populateDAG()*.
>
> I'll create a JIRA and add more details there.
>
> Ram
>
> On Sun, May 8, 2016 at 7:02 PM, Munagala Ramanath <[email protected]> wrote:
>
> > Hi,
> >
> > I wrote a small app to exercise the XmlParser operator. The app works fine
> > with Malhar 3.2 but fails with 3.3 with an exception like this:
> >
> > java.lang.IllegalArgumentException
> >   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:637)
> >   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
> >   at com.datatorrent.lib.parser.XmlParser.setup(XmlParser.java:135)
> >   at com.datatorrent.lib.parser.XmlParser.setup(XmlParser.java:63)
> >   at com.datatorrent.stram.engine.Node.setup(Node.java:161)
> >   at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1287)
> >   at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:92)
> >   at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1361)
> >
> > The operator has moved to the library module in 3.3 from contrib and there
> > are other changes as well, so I made the minor changes needed to accommodate
> > the move but to no avail. I tried both 3.2.0 and 3.3.0 of apex-core, tried
> > adding JAXB annotations to the Employee class but nothing seems to make any
> > difference -- I get the same exception.
> >
> > My app for 3.3 (slightly different for 3.2) looks like this:
> > -------------------------------------
> > public void populateDAG(DAG dag, Configuration conf)
> > {
> >   Gen gen = dag.addOperator("generator", new Gen());
> >
> >   // configure parser
> >   XmlParser parser = dag.addOperator("parser", new XmlParser());
> >   parser.setClazz(Employee.class);
> >
> >   ConsoleOutputOperator cons = dag.addOperator("console", new ConsoleOutputOperator());
> >
> >   dag.addStream("input", gen.output, parser.in).setLocality(Locality.CONTAINER_LOCAL);
> >   dag.addStream("data", parser.parsedOutput, cons.input).setLocality(Locality.CONTAINER_LOCAL);
> > ----------------------------------------
> >
> > Both versions of the project are in branch *add-xmlparse* at:
> > *[email protected]:amberarrow/examples.git*
> >
> > Anybody know the right way to use this operator in 3.3 ?
> >
> > Thanks.
> >
> > Ram
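
For reference, a minimal sketch of the transient-field behavior described in the reply at the top of this thread: a value set on a transient field before serialization comes back as null afterwards. It uses plain Java serialization as a stand-in, which is an assumption, since the platform's own serializer may differ in detail. The classes TransientDemo, WithTransient and WithoutTransient are hypothetical and are not the Malhar Parser classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for an operator with a clazz field; not the Malhar code.
class TransientDemo {

  static class WithTransient implements Serializable {
    private static final long serialVersionUID = 1L;
    transient Class<?> clazz; // skipped when the object is serialized
  }

  static class WithoutTransient implements Serializable {
    private static final long serialVersionUID = 1L;
    Class<?> clazz; // written out and restored with the object
  }

  // Serialize to a byte array and read the object back, mimicking the
  // serialize/deserialize step that happens before setup() is called.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(obj);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round trip failed", e);
    }
  }

  public static void main(String[] args) {
    WithTransient lost = new WithTransient();
    lost.clazz = String.class;

    WithoutTransient kept = new WithoutTransient();
    kept.clazz = String.class;

    // The transient field is null after the round trip; the plain field is not.
    System.out.println("transient field after round trip: " + roundTrip(lost).clazz);
    System.out.println("regular field after round trip:   " + roundTrip(kept).clazz);
  }
}
```

A unit test that only calls the setter and then setup() on the same instance never goes through roundTrip(), which matches the observation that the existing tests did not catch the problem.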

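
On the *DocumentBuilder.parse()* point raised in the thread: the single-String overload of parse() treats its argument as a URI, so XML that is already held in a String is usually wrapped in an InputSource backed by a StringReader. The following is only a sketch of that approach, not the fix that went into the operator; XmlStringParseDemo and parseXml are hypothetical names, and the sample XML is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; shows one way to feed an XML string to DocumentBuilder.
class XmlStringParseDemo {

  // Parse XML text held in memory by wrapping it in an InputSource,
  // instead of passing the text itself where a URI is expected.
  static Document parseXml(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      return builder.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse XML string", e);
    }
  }

  public static void main(String[] args) {
    // Made-up sample input for illustration only.
    Document doc = parseXml("<employee><name>test</name></employee>");
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```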