Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser
......................................................................


Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File 
asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48:             JSONObject xmlObj = XML.toJSONObject(record.toString());
> "Some information may be lost" doesn't sound ideal :) Is this aiming to be 
I guess this means it can only keep the information that can be represented in 
JSON, since the two are different formats. That is why it says 'Comments, 
prologs, DTDs, and <[ [ ]]> are ignored'. 

This was designed for CAPS messages, each of which is a small XML document 
with simple elements. Later Taewoo came to me with an application scenario in 
which he needs to load XML documents into AsterixDB, so I extended this a 
little to fit his needs.

After meeting with Taewoo today, it turns out he has a slightly different 
application scenario: he needs to map a nested element in an XML document 
(e.g. <record> in <records> <record>...</record> <record> ... </record> ... 
</records>) into a record. One possible change is to let the user specify the 
level at which the mapping is done, but I am not sure that's general enough.
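
To make the level-based idea concrete, here is a minimal sketch using only the 
JDK's DOM parser. It is not what the patch does: XmlLevelMapper, 
recordsAtDepth and the flat Map-per-record shape are hypothetical names and 
choices made up for illustration. It assumes the root element is depth 0 and 
that each mapped element has only simple child elements; attributes and 
repeated child names are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the patch: maps every element found at a
// user-chosen depth (root element = depth 0) to one flat record.
class XmlLevelMapper {

    static List<Map<String, String>> recordsAtDepth(String xml, int depth) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<Map<String, String>> records = new ArrayList<>();
            collect(root, depth, records);
            return records;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
    }

    // Walks down 'remaining' levels, then turns each element reached into a record.
    private static void collect(Element element, int remaining, List<Map<String, String>> out) {
        if (remaining == 0) {
            out.add(toRecord(element));
            return;
        }
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                collect((Element) n, remaining - 1, out);
            }
        }
    }

    // Child element name -> trimmed text content; a repeated name keeps the last value.
    private static Map<String, String> toRecord(Element element) {
        Map<String, String> record = new LinkedHashMap<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                record.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return record;
    }

    public static void main(String[] args) {
        String xml = "<records><record><id>1</id><name>a</name></record>"
                + "<record><id>2</id><name>b</name></record></records>";
        // Depth 1 selects each <record> under the <records> root.
        System.out.println(recordsAtDepth(xml, 1));
    }
}
```

With depth 1 each <record> becomes its own record, while depth 0 would map 
the whole <records> element to a single one. A fixed depth only works when all 
records sit at the same level, which is part of why I doubt it is general 
enough.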

Generally, IMO there is a need for processing XML in AsterixDB, and having 
something that works (even if it is not perfect) can probably ease the pain.


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang <xkk...@gmail.com>
Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Taewoo Kim <wangs...@yahoo.com>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-Reviewer: Xikui Wang <xkk...@gmail.com>
Gerrit-Reviewer: abdullah alamoudi <bamou...@gmail.com>
Gerrit-HasComments: Yes
