Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> "Some information may be lost" doesn't sound ideal :) Is this aiming to be

I think this means the conversion can only preserve information that can be represented in JSON, since XML and JSON are two different formats. That is why it says 'Comments, prologs, DTDs, and <[ [ ]]> are ignored'.

This parser was originally designed for CAPS messages, each of which is a small XML document with simple elements. Later Taewoo came to me with an application scenario in which he needs to load XML documents into AsterixDB, so I extended it a little to fit his needs.

After meeting with Taewoo today, it turns out he has a slightly different scenario: he needs to map a nested element in an XML document (e.g. each <record> in <records> <record>...</record> <record>...</record> ... </records>) into a record. One possible change is to let the user specify the level at which the mapping should happen, but I am not sure that is general enough.

In general, IMO there is a need for processing XML in AsterixDB, and something that works, even if it is not perfect, can probably ease the pain.

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xkk...@gmail.com>
Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Taewoo Kim <wangs...@yahoo.com>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-Reviewer: Xikui Wang <xkk...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <bamou...@gmail.com>
Gerrit-HasComments: Yes
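
For illustration only, the level-based mapping floated in the comment (treating each <record> nested under <records> as its own record) could look roughly like the sketch below. It is not the code in this patch: it swaps in the JDK's built-in DOM parser instead of org.json's XML.toJSONObject, the class name XmlLevelMapper and the method recordsAtLevel are hypothetical, and it assumes each record contains only simple child elements with text content.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of AsterixDB or of this change.
class XmlLevelMapper {

    /**
     * Returns one flat field map per element found at the given level,
     * where level 0 is the document root and level 1 is its children.
     * Assumption: each record element holds only simple child elements.
     */
    static List<Map<String, String>> recordsAtLevel(String xml, int level) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Walk down one level at a time, collecting the element children.
        List<Element> current = new ArrayList<>();
        current.add(doc.getDocumentElement());
        for (int depth = 0; depth < level; depth++) {
            List<Element> next = new ArrayList<>();
            for (Element parent : current) {
                next.addAll(childElements(parent));
            }
            current = next;
        }

        // Turn each element at the target level into a tag -> text map.
        // A repeated tag name overwrites the earlier value in this sketch.
        List<Map<String, String>> records = new ArrayList<>();
        for (Element recordElement : current) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (Element child : childElements(recordElement)) {
                fields.put(child.getTagName(), child.getTextContent().trim());
            }
            records.add(fields);
        }
        return records;
    }

    /** Collects the direct children that are elements, skipping text and comments. */
    private static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) node);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<records><record><id>1</id><name>a</name></record>"
                + "<record><id>2</id><name>b</name></record></records>";
        // Level 1 selects the <record> elements directly under <records>.
        for (Map<String, String> record : recordsAtLevel(xml, 1)) {
            System.out.println(record);
        }
    }
}
```

As the comment already suspects, a single numeric level may not be general enough: it would not handle documents where records sit at different depths or are mixed with other sibling elements, so a tag name or path might be a better selector.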