Hi Thomas,

*this is also what I would expect. Path-based storage does rely on very
smart ways to match terms in a query to paths of course.*

Did you test, or is this theoretical?

Jan-Marc

On Tue, 26 Jan 2016 at 11:36, Thomas Beale <thomas.be...@openehr.org> wrote:

>
>
> On 26/01/2016 09:51, Bert Verhees wrote:
>
> On 26-01-16 10:38, Jan-Marc Verlinden wrote:
>
>
>    - Our first version was Java-based with a Postgres DB, everything
>    stored as path/values.
>    Every query would take about a second. We did not even try complex
>    queries :-). Also the GUI side did not know what to do with the
>    path/values.
>
> Hi Jan-Marc,
>
> There were some problems handling the path/values; most of them came down
> to giving a semantic meaning to the paths.
> Storing a path and its corresponding value is very, very quick. I asked
> database specialists, and they say this is the best way to go, up to
> billions of records.
>
>
> this is also what I would expect. Path-based storage does rely on very
> smart ways to match terms in a query to paths of course. There are
> some tricks to use here. For example, the path to systolic BP DV_QUANTITY
> node from the archetype is
>
> /data[id2|history|]/events[id7|any event|]/data[id4]/items[id5|Systolic|]/value
>
> In the whole of CKM there are probably about 7,000 'interesting' leaf
> paths (if you assume that you crunch DATA_VALUE subtypes into little
> blobs). That's a tiny number. Assume that when they've modelled everything
> in medicine (outside of genomics and proteomics) that we have 50,000 such
> 'paths of interest'. That's a very small number. These paths can be mapped
> in smart ways to a 64-bit number space so that finding out if a specific
> query term is in some EHR is very quick. When you include a coded list of
> archetype ids in the mix, I think querying can be made extremely quick.
>
> The devil is in the details. Various large DBs used path-based approaches
> in the past; Informix was one.
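
A minimal sketch of the path-to-number idea described above, not a tested design. Everything in it is an assumption for illustration: the archetype id string, the `normalize`, `path_id` and `PathIndex` names, and the choice of a BLAKE2b hash truncated to 64 bits as the "smart" mapping. The inline `|...|` annotations are stripped first so that the same node always maps to the same number.

```python
import hashlib
import re

# Matches the human-readable annotations in a path, e.g. "|any event|".
_ANNOTATION = re.compile(r"\|[^|]*\|")


def normalize(path):
    """Strip |...| annotations so only the node ids remain in the path."""
    return _ANNOTATION.sub("", path)


def path_id(archetype_id, path):
    """Map an archetype id plus leaf path to an unsigned 64-bit integer.

    Assumption: a truncated BLAKE2b hash stands in for whatever mapping
    a real system would use.
    """
    key = (archetype_id + "\x00" + normalize(path)).encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big")


class PathIndex:
    """Inverted index: 64-bit path id -> set of EHR ids containing that path."""

    def __init__(self):
        self._ehrs_by_path = {}

    def add(self, ehr_id, archetype_id, path):
        """Record that the given EHR holds a value at this leaf path."""
        pid = path_id(archetype_id, path)
        self._ehrs_by_path.setdefault(pid, set()).add(ehr_id)

    def ehrs_with(self, archetype_id, path):
        """Return the sorted EHR ids that hold a value at this leaf path."""
        pid = path_id(archetype_id, path)
        return sorted(self._ehrs_by_path.get(pid, set()))


# Hypothetical archetype id, used only for illustration.
BP = "example-OBSERVATION.blood_pressure"
SYSTOLIC = ("/data[id2|history|]/events[id7|any event|]"
            "/data[id4]/items[id5|Systolic|]/value")

index = PathIndex()
index.add("ehr-2", BP, SYSTOLIC)
index.add("ehr-1", BP, SYSTOLIC)
print(index.ehrs_with(BP, SYSTOLIC))
```

With a lookup like this, checking whether a query term occurs in any EHR is a single dictionary access. At the 50,000 paths mentioned above, collisions in a 64-bit space are very unlikely, but a real system would still need a plan for handling them.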
>
>
>
> It is also easy to migrate to another database, for clustering or other
> reasons.
>
> But there are some problems to solve, which were harder to solve five
> years ago.
>
> One problem is the GUI builders: they are faced with a database approach
> that is difficult to understand, easy to introduce errors in, and hard to
> debug.
> They need JSON to write their datasets in.
>
> The other problem is querying. As long as the queries are predefined, you
> can do anything, but then you are no different from an old monolithic
> system.
> But writing new templates relies heavily on on-the-fly query building.
>
> There is, however, some technological progress, also in the open source
> domain.
>
> The path/value storage could be revived with the help of ANTLR, which can
> help to interpret AQL for this purpose. I even think this is promising.
>
> Let engineers read The Definitive ANTLR 4 Reference by Terence Parr, and
> read it with path/values in the back of their minds. Both the GUI problem
> and the query problem can be solved.
>
> It should be worth the time spent and the price of the book ;-)
>
>
> It is.
>
>
> - thomas
> _______________________________________________
> openEHR-technical mailing list
> openEHR-technical@lists.openehr.org
>
> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org

-- 

Jan-Marc Verlinden
MedVision (mobile)

-- 
*MedVision BV*
Aagje Dekenkade 71
2251 ZV, Voorschoten
www.medvision360.com
