Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-19 Thread Xikui Wang (Code Review)
Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> "Some information may be lost" doesn't sound ideal :) Is this aiming to be 
I guess this means it can only preserve the information that can be stored in 
JSON, since the two are different formats. That is why it says 'Comments, 
prologs, DTDs, and <[ [ ]]> are ignored'. 

This was designed for CAPS messages, each of which is a small XML document 
with simple elements. Later Taewoo came to me with an application scenario in 
which he needs to load XML documents into AsterixDB, so I extended this a 
little to fit his needs.

After meeting with Taewoo today, it turns out he has a slightly different 
application scenario: he needs to map a nested element in an XML document 
(e.g.  in  ...  ...  ... 
 into a record. One possible change is to let users specify the level at 
which the mapping is done, but I am not sure that's general enough.
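
To make the "specify the level" idea concrete, here is a minimal sketch, 
assuming a plain JDK DOM parser rather than the org.json-based code in this 
patch. The class name XmlLevelSplitter, the method elementsAtLevel, and the 
feed/batch/msg element names are all hypothetical and exist only for 
illustration; level 0 is taken to mean the root element itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the patch: picks the elements that sit
// a user-chosen number of levels below the root, so that each one could
// then be mapped into its own record.
final class XmlLevelSplitter {

    /** Returns the elements exactly `level` steps below the root (0 = the root). */
    static List<Element> elementsAtLevel(String xml, int level) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        List<Element> out = new ArrayList<>();
        collect(root, level, out);
        return out;
    }

    // Depth-first walk; only element children count as a level step,
    // so text and comment nodes are skipped.
    private static void collect(Element e, int remaining, List<Element> out) {
        if (remaining == 0) {
            out.add(e);
            return;
        }
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n instanceof Element) {
                collect((Element) n, remaining - 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document: three <msg> elements at level 2.
        String xml = "<feed><batch><msg id=\"1\"/><msg id=\"2\"/></batch>"
                + "<batch><msg id=\"3\"/></batch></feed>";
        for (Element e : elementsAtLevel(xml, 2)) {
            System.out.println(e.getTagName() + " id=" + e.getAttribute("id"));
        }
    }
}
```

A single integer level is simple, but it treats every branch of the document 
the same way, which is part of why it may not be general enough; a path-like 
selector would be one alternative.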

Generally, IMO there is a need for processing XML in AsterixDB, and something 
that works (even if not perfect) can probably ease the pain.


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1269
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-Reviewer: Till Westmann 
Gerrit-Reviewer: Xikui Wang 
Gerrit-Reviewer: abdullah alamoudi 
Gerrit-HasComments: Yes


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-18 Thread Till Westmann (Code Review)
Till Westmann has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> I found this in the org.json.XML documentation.
"Some information may be lost" doesn't sound ideal :) Is this aiming to be a 
general purpose XML ingestion method or is there a specific use-case that needs 
to be addressed?



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-Reviewer: Till Westmann 
Gerrit-Reviewer: Xikui Wang 
Gerrit-Reviewer: abdullah alamoudi 
Gerrit-HasComments: Yes


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-17 Thread Xikui Wang (Code Review)
Xikui Wang has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/1269/6/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java:

Line 48: JSONObject xmlObj = XML.toJSONObject(record.toString());
> Is there a description of this transformation somewhere? A quick google sea
I found this in the org.json.XML documentation.

Convert a well-formed (but not necessarily valid) XML string into a JSONObject. 
Some information may be lost in this transformation because JSON is a data 
format and XML is a document format. XML uses elements, attributes, and content 
text, while JSON uses unordered collections of name/value pairs and arrays of 
values. JSON does not like to distinguish between elements and 
attributes. Sequences of similar elements are represented as JSONArrays. 
Content text may be placed in a "content" member. Comments, prologs, DTDs, and 
<[ [ ]]> are ignored.

One more thing: I skipped the root element so that its nested elements 
(mapped into attributes) are exposed directly in the JSON object.
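
As an illustration of the two points above (the root is skipped, and elements 
and attributes end up indistinguishable), here is a minimal sketch. It is a 
stand-in written against the JDK DOM API, not the org.json conversion and not 
the code in XMLFileParser; the class name RootSkippingFlattener, the method 
flattenRoot, and the alert/severity/sender names are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the XML-to-record mapping discussed above.
final class RootSkippingFlattener {

    /**
     * Maps the root's attributes and child elements to name/value pairs.
     * The root's own name is dropped, and comments are ignored.
     */
    static Map<String, String> flattenRoot(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        Map<String, String> fields = new LinkedHashMap<>();

        // Attributes of the root become plain fields.
        NamedNodeMap attrs = root.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            fields.put(a.getNodeName(), a.getNodeValue());
        }

        // Child elements become fields of the same kind, so the result no
        // longer records which fields were attributes and which were elements.
        NodeList kids = root.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<alert severity=\"high\"><!-- draft -->"
                + "<sender>NWS</sender></alert>";
        System.out.println(flattenRoot(xml)); // prints {severity=high, sender=NWS}
    }
}
```

The printed map contains no trace of the root name or the comment, and the 
attribute and the child element look alike, which is the kind of information 
loss the quoted documentation warns about.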



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-Reviewer: Till Westmann 
Gerrit-Reviewer: Xikui Wang 
Gerrit-Reviewer: abdullah alamoudi 
Gerrit-HasComments: Yes


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/923/ : SUCCESS


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

Integration Tests Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/923/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6: Integration-Tests-1

Integration Tests Failed

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/919/ : UNSTABLE


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

Integration Tests Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/919/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 6:

Build Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/3013/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-13 Thread Xikui Wang (Code Review)
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

https://asterix-gerrit.ics.uci.edu/1269

to look at the new patch set (#6).

Change subject: Introduce XML Reader & Parser
..

Introduce XML Reader & Parser

1. Add a record reader for XML documents.
2. Add an XML parser based on XML-to-JSON conversion and ADMParser.
3. Fix ASTERIX-1690: deadlock between close() and take() in FileSystemWatcher
4. Add test cases for using the XML adaptor in feed and load statements.

Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
---
A asterixdb/asterix-app/data/xml/ER.xml
A asterixdb/asterix-app/data/xml/HSA.xml
A asterixdb/asterix-app/data/xml/STA.xml
A asterixdb/asterix-app/data/xml/bigger.xml
A asterixdb/asterix-app/data/xml/small_ER.xml
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.3.sleep.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.4.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.5.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.0.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.1.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/load/load-xml-files/load-xml-files.2.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/results/feeds/xml-adaptor/xml-adaptor.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/load/load-xml-files/load-xml-files.1.adm
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite.xml
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/stream/XMLFileRecordReader.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/factory/XMLFileParserFactory.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/ParserFactoryProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/StreamRecordReaderProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataConstants.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/FileSystemWatcher.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/LocalFileSystemUtils.java
M asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/feeds/FeedMetadataUtil.java
25 files changed, 794 insertions(+), 34 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/69/1269/6

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-12 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 5: Integration-Tests+1

Integration Tests Successful

https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/897/ : SUCCESS


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-12 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 5:

Integration Tests Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-integration-tests/897/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-12 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 5:

Build Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/2986/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-Reviewer: Taewoo Kim 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-12 Thread Jenkins (Code Review)
Jenkins has posted comments on this change.

Change subject: Introduce XML Reader & Parser
..


Patch Set 4:

Build Started 
https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-notopic/2982/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins 
Gerrit-HasComments: No


Change in asterixdb[master]: Introduce XML Reader & Parser

2016-10-12 Thread Xikui Wang (Code Review)
Hello Jenkins,

I'd like you to reexamine a change.  Please visit

https://asterix-gerrit.ics.uci.edu/1269

to look at the new patch set (#4).

Change subject: Introduce XML Reader & Parser
..

Introduce XML Reader & Parser

1. Add a record reader for XML documents.
2. Add an XML parser based on XML-to-JSON conversion and ADMParser.
3. Fix ASTERIX-1690: deadlock between close() and take() in FileSystemWatcher

Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
---
A asterixdb/asterix-app/data/xml/ER.xml
A asterixdb/asterix-app/data/xml/HSA.xml
A asterixdb/asterix-app/data/xml/STA.xml
A asterixdb/asterix-app/data/xml/small_ER.xml
M asterixdb/asterix-app/src/test/resources/runtimets/only.xml
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.3.sleep.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.4.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/feeds/xml-adaptor/xml-adaptor.5.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/results/feeds/xml-adaptor/xml-adaptor.1.adm
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite.xml
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/stream/XMLFileRecordReader.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/XMLFileParser.java
A asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/parser/factory/XMLFileParserFactory.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/ParserFactoryProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/StreamRecordReaderProvider.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataConstants.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/FileSystemWatcher.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/LocalFileSystemUtils.java
M asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/feeds/FeedMetadataUtil.java
21 files changed, 529 insertions(+), 34 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/69/1269/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia36101a0761973a9edb96b42d3dcc117661301da
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Xikui Wang 
Gerrit-Reviewer: Jenkins