madIS is an extensible relational database system built on the SQLite
database, with extensions written in Python (via the APSW SQLite
wrapper). It is developed at:

http://madis.googlecode.com

Because madIS is built on the SQLite core, its database format is exactly the
same as SQLite's. This means that all SQLite database files are directly usable with madIS.
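
As a small illustration of the point above, here is a sketch that writes an ordinary SQLite file with Python's standard sqlite3 module and checks its file header. The file name and the "changes" table are made up for the example; nothing here is madIS code. A file produced this way is a plain SQLite database, so by the statement above it should be usable with madIS as is.

```python
import os
import sqlite3
import tempfile

# Hypothetical file and table names, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "changes.db")

con = sqlite3.connect(path)
con.execute("create table changes(id text, title text)")
con.execute("insert into changes values (?, ?)",
            ("92bc61a496", "Fixed help formatting"))
con.commit()
con.close()

# Every SQLite 3 database file starts with this 16-byte magic string.
with open(path, "rb") as f:
    header = f.read(16)

print(header)
```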

In use, madIS feels like a lightweight personal Hive+Hadoop programming environment, without the distributed processing capabilities of Hadoop. Nevertheless, due to its low overhead when running on a single computer (compared to Hadoop), madIS can easily handle tens of millions of rows on a single desktop/laptop computer.

In version 1.4 of madIS:

- XMLPARSE can now work without any provided prototype, producing JSON dicts which contain all parsed XML data. XMLPARSE now also accepts Jdicts and Jlists, in addition to XML snippets, as prototypes.
- FILE works with gzip-compressed files and with HTTP and FTP streams, directly in a streaming way.
- New functions which work with Jdicts were added (jdictkeys, jdictvals, jdictsplit, jdictgroupunion).
- APACHELOGSPLIT parses and splits Apache log lines.
- Optimizations in Virtual Tables (up to 3 times faster). XMLPARSE is up to 2x faster (using the fast:1 switch).

About XMLPARSE:

XMLPARSE can now work without any prototype. In this mode it produces one JSON dict per occurrence of the provided root tag, containing all the path:data pairs found below it:

-- Example
mterm> select * from (XMLPARSE root:entry FILE 'http://code.google.com/feeds/p/madis/hgchanges/basic') limit 1;

{"entry/updated":"2011-12-09T17:18:43Z","entry/id":"http://code.google.com/feeds/p/madis/hgchanges/basic/92bc61a496b3a34c21c5aed6d9c6cde5ac63121e","entry/link/@/href":"http://code.google.com/p/madis/source/detail?r=92bc61a496b3a34c21c5aed6d9c6cde5ac63121e","entry/link/@/type":"text/html","entry/link/@/rel":"alternate","entry/title":"Revision 92bc61a496: Fixed help formatting","entry/author/name":"est...@servum","entry/content/@/type":"html","entry/content":"Changed Paths:<br/>\n Modify /src/functions/vtable/xmlparse.py\n \n <br/>\n <br/>Fixed help formatting"}
--
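
For readers who want a feel for the path:data idea outside of madIS, here is a rough sketch in plain Python using the standard xml.etree.ElementTree module. It is not the XMLPARSE implementation; the function names flatten and parse_entries and the SAMPLE document are invented for the example. The sample has no XML namespace, because ElementTree would otherwise report tags as "{uri}tag", and repeated paths simply overwrite each other here.

```python
import json
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Map 'path' -> text for elem and all of its descendants."""
    path = prefix + elem.tag
    out = {}
    # Attributes go under path/@/name, mimicking the output shown above.
    for name, value in elem.attrib.items():
        out[path + "/@/" + name] = value
    text = (elem.text or "").strip()
    if text:
        out[path] = text
    for child in elem:
        out.update(flatten(child, path + "/"))
    return out

def parse_entries(xml_text, root="entry"):
    """Yield one JSON dict string per element named `root`."""
    for elem in ET.fromstring(xml_text).iter(root):
        yield json.dumps(flatten(elem))

# Made-up sample, loosely shaped like one feed entry.
SAMPLE = """<feed>
  <entry>
    <updated>2011-12-09T17:18:43Z</updated>
    <link href="http://example.org/r1" rel="alternate"/>
    <title>Fixed help formatting</title>
  </entry>
</feed>"""

rows = list(parse_entries(SAMPLE))
print(rows[0])
```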

To find all the paths that appear in the above feed, one can use the jgroupunion aggregate function, which produces the union of all XML paths:

-- Example
mterm> select JGROUPUNION(c1) from (XMLPARSE root:entry FILE 'http://code.google.com/feeds/p/madis/hgchanges/basic');

["entry/updated","entry/id","entry/link/@/href","entry/link/@/type","entry/link/@/rel","entry/title","entry/author/name","entry/content/@/type","entry/content"]
--
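
What the union computes can be sketched in a few lines of plain Python. This only approximates the behaviour seen in the example above (keys collected in order of first appearance); the name jgroupunion_sketch and the two input rows are made up, and the real aggregate may differ in details.

```python
import json

def jgroupunion_sketch(jdicts):
    """Union of the keys of several JSON dicts, in first-seen order."""
    seen = {}
    for text in jdicts:
        for key in json.loads(text):
            seen.setdefault(key, None)
    return json.dumps(list(seen), separators=(",", ":"))

# Two made-up rows with partly different paths.
union_rows = [
    '{"entry/id":"1","entry/title":"a"}',
    '{"entry/title":"b","entry/author/name":"x"}',
]
union_result = jgroupunion_sketch(union_rows)
print(union_result)
```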

To find the common set of XML paths that appear in all of the ATOM feed's entries, one can do:

-- Example
mterm> select JGROUPINTERSECTION(c1) from (XMLPARSE root:entry FILE 'http://code.google.com/feeds/p/madis/hgchanges/basic');

["entry/updated","entry/id","entry/link/@/href","entry/link/@/type","entry/link/@/rel","entry/title","entry/author/name","entry/content/@/type","entry/content"]
-- Note: Intersection in this example is the same as union --
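
The intersection can be sketched the same way in plain Python: keep only the keys that are present in every dict. Again, jgroupintersection_sketch and its input rows are invented for illustration and are not the madIS implementation.

```python
import json

def jgroupintersection_sketch(jdicts):
    """Keys present in every JSON dict, in the order of the first dict."""
    common = None
    for text in jdicts:
        keys = list(json.loads(text))
        if common is None:
            common = keys
        else:
            present = set(keys)
            common = [k for k in common if k in present]
    return json.dumps(common or [], separators=(",", ":"))

# Two made-up rows; only "entry/title" appears in both.
inter_rows = [
    '{"entry/id":"1","entry/title":"a"}',
    '{"entry/title":"b","entry/author/name":"x"}',
]
inter_result = jgroupintersection_sketch(inter_rows)
print(inter_result)
```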

Finally, to output the contents of the ATOM feed in tabular form, one simply has to provide the list of paths as a parameter to XMLPARSE:

-- Example
mterm> select * from (
XMLPARSE root:entry
'["entry/updated","entry/id","entry/link/@/href","entry/link/@/type","entry/link/@/rel","entry/title","entry/author/name","entry/content/@/type","entry/content"]'
  FILE 'http://code.google.com/feeds/p/madis/hgchanges/basic')
  limit 1;

[1|2011-12-09T17:18:43Z
[2|http://code.google.com/feeds/p/madis/hgchanges/basic/92bc61a496b3a34c21c5aed6d9c6cde5ac63121e
[3|http://code.google.com/p/madis/source/detail?r=92bc61a496b3a34c21c5aed6d9c6cde5ac63121e
[4|text/html
[5|alternate
[6|Revision 92bc61a496: Fixed help formatting
[7|est...@servum
[8|html
[9|Changed Paths:<br/>
     Modify    /src/functions/vtable/xmlparse.py

 <br/>
 <br/>Fixed help formatting
--- [0|Column names ---
[1|updated [2|id [3|link_href [4|link_type [5|link_rel [6|title [7|author_name [8|content_type [9|content
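
The step from a list of paths to table columns can be approximated in plain Python as follows. The naming rule used here (drop the root tag and the "@" marker, join the rest with "_") is only inferred from the column names printed above, and column_name and project are hypothetical helpers, not madIS functions.

```python
import json

def column_name(path, root="entry"):
    """Guess a column name from a path, e.g. entry/link/@/href -> link_href."""
    parts = [p for p in path.split("/") if p != "@"]
    if parts and parts[0] == root:
        parts = parts[1:]
    return "_".join(parts)

def project(jdicts, paths):
    """Turn JSON dicts into tuples, one column per requested path."""
    for text in jdicts:
        record = json.loads(text)
        # Paths missing from a record become None.
        yield tuple(record.get(p) for p in paths)

paths = ["entry/updated", "entry/id", "entry/link/@/href",
         "entry/link/@/type", "entry/link/@/rel", "entry/title",
         "entry/author/name", "entry/content/@/type", "entry/content"]

columns = [column_name(p) for p in paths]
table = list(project(['{"entry/updated":"2011-12-09T17:18:43Z"}'], paths))
print(columns)
```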


-- Lefteris