Change in asterixdb[master]: Introduce XML Reader & Parser
Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> "Some information may be lost" doesn't sound ideal :) Is this aiming to be

I guess this means it can only preserve the information that can be stored in JSON, since the two are different formats. That is why it says 'Comments, prologs, DTDs, and <[ [ ]]> are ignored'.

This was designed for CAPS messages, where each document is a small XML document with simple elements. Later Taewoo came to me with an application scenario in which he needs to load XML documents into AsterixDB, so I extended this a little to fit his needs.

After meeting with Taewoo today, it turns out he has a slightly different application scenario: he needs to map a nested element in an XML document (e.g. in ... ... ... into a record. One possible change is to let the user specify the level at which the mapping should happen, but I am not sure that's general enough.

Generally, IMO there is a need for processing XML in AsterixDB, and something that works (even if it is not perfect) can probably ease the pain.

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-Reviewer: Till Westmann
Gerrit-Reviewer: Xikui Wang
Gerrit-Reviewer: abdullah alamoudi
Gerrit-HasComments: Yes
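For readers following the thread, the "let the user specify the level" idea from the comment above could look roughly like the sketch below. This is not code from the patch: the class `LevelMappingSketch`, the method `recordsAtDepth`, and the `depth` parameter are hypothetical names, and the sketch uses the JDK's DOM parser rather than the org.json conversion that `XMLFileParser` relies on. It also assumes the elements at the chosen level have flat, text-only children.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; none of these names come from the patch.
final class LevelMappingSketch {

    /**
     * Returns one record per element found at the given depth (0 = root).
     * Each record maps the names of the element's direct children to their text.
     */
    static List<Map<String, String>> recordsAtDepth(String xml, int depth) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalArgumentException("input is not well-formed XML", ex);
        }

        // Walk down level by level until the requested depth is reached.
        List<Element> level = new ArrayList<>();
        level.add(root);
        for (int d = 0; d < depth; d++) {
            List<Element> next = new ArrayList<>();
            for (Element e : level) {
                next.addAll(childElements(e));
            }
            level = next;
        }

        // Map every element at that depth into a flat record.
        List<Map<String, String>> records = new ArrayList<>();
        for (Element e : level) {
            Map<String, String> rec = new LinkedHashMap<>();
            for (Element c : childElements(e)) {
                rec.put(c.getTagName(), c.getTextContent().trim());
            }
            records.add(rec);
        }
        return records;
    }

    /** Direct child elements only; text, comments, etc. are skipped. */
    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                out.add((Element) kids.item(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<alerts><alert><id>1</id><area>north</area></alert>"
                + "<alert><id>2</id><area>south</area></alert></alerts>";
        // prints [{id=1, area=north}, {id=2, area=south}]
        System.out.println(recordsAtDepth(xml, 1));
    }
}
```

A depth alone cannot tell sibling elements of different kinds apart, which may be part of why the comment doubts that the option is general enough; selecting by element name or path would be an alternative design.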
Change in asterixdb[master]: Introduce XML Reader & Parser
Till Westmann has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> I found this in the org.json.XML documentation.

"Some information may be lost" doesn't sound ideal :) Is this aiming to be a general-purpose XML ingestion method, or is there a specific use case that needs to be addressed?

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-Reviewer: Till Westmann
Gerrit-Reviewer: Xikui Wang
Gerrit-Reviewer: abdullah alamoudi
Gerrit-HasComments: Yes
Change in asterixdb[master]: Introduce XML Reader & Parser
Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> Is there a description of this transformation somewhere? A quick google sea

I found this in the org.json.XML documentation:

Convert a well-formed (but not necessarily valid) XML string into a JSONObject. Some information may be lost in this transformation because JSON is a data format and XML is a document format. XML uses elements, attributes, and content text, while JSON uses unordered collections of name/value pairs and arrays of values. JSON does not like to distinguish between elements and attributes. Sequences of similar elements are represented as JSONArrays. Content text may be placed in a "content" member. Comments, prologs, DTDs, and <[ [ ]]> are ignored.

One more thing: I skip the root element so that its nested elements (mapped into attributes) are exposed directly in the JSON object.

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-Reviewer: Till Westmann
Gerrit-Reviewer: Xikui Wang
Gerrit-Reviewer: abdullah alamoudi
Gerrit-HasComments: Yes
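The conversion rules quoted in the comment above (attributes and child elements both become name/value pairs, repeated elements become arrays, text goes into a "content" member, comments are dropped) plus the root-skipping step can be illustrated with a small stand-alone sketch. It is an approximation built on the JDK's DOM parser, not the org.json implementation and not the code in `XMLFileParser`; the class `XmlToRecordSketch` and its methods are hypothetical, and all values are kept as strings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration of the quoted rules; not org.json and not the patch.
final class XmlToRecordSketch {

    /** Parses the XML and returns the root's fields, dropping the root name itself. */
    static Map<String, Object> toRecord(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return fields(root);
        } catch (Exception ex) {
            throw new IllegalArgumentException("input is not well-formed XML", ex);
        }
    }

    /** Attributes and child elements land in the same map; comments are ignored. */
    static Map<String, Object> fields(Element e) {
        Map<String, Object> out = new LinkedHashMap<>();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            add(out, attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
        StringBuilder text = new StringBuilder();
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid.getNodeType() == Node.ELEMENT_NODE) {
                Element child = (Element) kid;
                Object value = isTextOnly(child) ? child.getTextContent().trim() : fields(child);
                add(out, child.getTagName(), value);
            } else if (kid.getNodeType() == Node.TEXT_NODE
                    || kid.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(kid.getNodeValue());
            }
        }
        String content = text.toString().trim();
        if (!content.isEmpty()) {
            add(out, "content", content); // mixed or leftover text
        }
        return out;
    }

    /** True when the element has no attributes and no child elements. */
    static boolean isTextOnly(Element e) {
        if (e.getAttributes().getLength() > 0) {
            return false;
        }
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                return false;
            }
        }
        return true;
    }

    /** A repeated key turns its value into a list, mirroring the JSONArray rule. */
    @SuppressWarnings("unchecked")
    static void add(Map<String, Object> out, String key, Object value) {
        Object prev = out.get(key);
        if (prev == null) {
            out.put(key, value);
        } else if (prev instanceof List) {
            ((List<Object>) prev).add(value);
        } else {
            List<Object> list = new ArrayList<>();
            list.add(prev);
            list.add(value);
            out.put(key, list);
        }
    }

    public static void main(String[] args) {
        String xml = "<msg id=\"7\"><!-- note --><sender>alice</sender>"
                + "<tag>a</tag><tag>b</tag></msg>";
        // prints {id=7, sender=alice, tag=[a, b]}
        System.out.println(toRecord(xml));
    }
}
```

In the output the `id` attribute and the `sender` element are indistinguishable, the comment is gone, and the `msg` root name no longer appears, which are the kinds of information loss the review discussion is about.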
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/923/ : SUCCESS

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

Integration Tests Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/923/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6: Integration-Tests-1

Integration Tests Failed

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/919/ : UNSTABLE

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

Integration Tests Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/919/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 6:

Build Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3013/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Hello Jenkins,

I'd like you to reexamine a change. Please visit

    https://asterix-gerrit.ics.uci.edu/1269

to look at the new patch set (#6).

Change subject: Introduce XML Reader & Parser

Introduce XML Reader & Parser

1. Add a record reader for XML document.
2. Add xml parser based on XML to JSON and ADMParser.
3. Fix ASTERIX-1690: deadlock between close() and take() in
   FileSystemWatcher
4. Add test cases for using XML adaptor in feed and load statement.

Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
---
A asterixdb/asterix-app/data/xml/ER.xml
A asterixdb/asterix-app/data/xml/HSA.xml
A asterixdb/asterix-app/data/xml/STA.xml
A asterixdb/asterix-app/data/xml/bigger.xml
A asterixdb/asterix-app/data/xml/small_ER.xml
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.3.sleep.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.4.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.5.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.0.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.1.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.2.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/results/feeds/xml-adaptor/xml-adaptor.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/load/load-xml-files/load-xml-files.1.adm
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite.xml
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/stream/XMLFileRecordReader.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/factory/XMLFileParserFactory.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/ParserFactoryProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/StreamRecordReaderProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataConstants.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/FileSystemWatcher.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/LocalFileSystemUtils.java
M asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/feeds/FeedMetadataUtil.java
25 files changed, 794 insertions(+), 34 deletions(-)

  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/69/1269/6

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 5: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/897/ : SUCCESS

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 5:

Integration Tests Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/897/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 5:

Build Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/2986/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Taewoo Kim
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser

Patch Set 4:

Build Started

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/2982/

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins
Gerrit-HasComments: No
Change in asterixdb[master]: Introduce XML Reader & Parser
Hello Jenkins,

I'd like you to reexamine a change. Please visit

    https://asterix-gerrit.ics.uci.edu/1269

to look at the new patch set (#4).

Change subject: Introduce XML Reader & Parser

Introduce XML Reader & Parser

1. Add a record reader for XML document.
2. Add xml parser based on XML to JSON and ADMParser.
3. Fix ASTERIX-1690: deadlock between close() and take() in
   FileSystemWatcher

Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
---
A asterixdb/asterix-app/data/xml/ER.xml
A asterixdb/asterix-app/data/xml/HSA.xml
A asterixdb/asterix-app/data/xml/STA.xml
A asterixdb/asterix-app/data/xml/small_ER.xml
M asterixdb/asterix-app/src/test/resources/runtimets/only.xml
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.3.sleep.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.4.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.5.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/results/feeds/xml-adaptor/xml-adaptor.1.adm
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite.xml
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/stream/XMLFileRecordReader.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/factory/XMLFileParserFactory.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/ParserFactoryProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/StreamRecordReaderProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataConstants.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/FileSystemWatcher.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/LocalFileSystemUtils.java
M asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/feeds/FeedMetadataUtil.java
21 files changed, 529 insertions(+), 34 deletions(-)

  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/69/1269/4

--
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang
Gerrit-Reviewer: Jenkins