Re: XML to RDB

Steven Siebert Mon, 25 Nov 2013 05:32:45 -0800

Larry,

If that's the only type of queries you're gonna make, sounds like MySQL can
be a good solution (from an indexing perspective, those queries will be
fine).  Basically, you would be using MySQL as a read-only copy of the XML
content...so your goal here is to understand how you'll be needing to
query, and build the best model (in MySQL) you can to achieve these
queries.  Going back to your initial question, I don't believe there is an
organic MySQL tool (or even a 3rd party tool that I know about) that will
do this for you - you'll probably have to write something, very small, to
do this.  Not sure how much programming experience you have...but something
like a simple SAX parser producing XML element and attribute "rows" that
you (probably batch) insert into the schema you make will work quite
easily...with very little code.

Be careful with the schema, especially the IDs.  The schema suggested by
Johan will work in simple cases, but might become a problem with unbounded
elements and if you wanted to break out XML arrays, for example (ie more
than one instance of a node's ID).  If you need to preserve sequence, that
might be something you need to consider too.  If you don't foresee the need
to query the parent hierarchy, I wouldn't add it (no parent_id)...not only
would it not be necessary, you have to consider how you want to query
hierarchy...if you need to, checked out the Nested Set model as opposed to
the adjacent list (see h
ttp://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/).
Again, this all depends on what you need...but based on the queries you
indicated, you wouldn't need it.

Good luck!

S

On Mon, Nov 25, 2013 at 4:29 AM, Johan De Meersman <vegiv...@tuxera.be>wrote:

>  _______________
> | Nodes         |
> |---------------|
> | id            | ---
> | [more fields] |   |
> | parent_id     | >--
> |_______________|
>
>
> And then you join the table with itself as needed. Do note that you'll
> need a self-join for every level you query, and that every self-join
> incrementally slows down the query.
>
> It may be beneficial (actually, I'm pretty sure it will be :-) ) to
> implement the recursion in software as necessary, simply follow the branch
> you want and check wether there are children at each turn; or pick up the
> child and go back up until parent_id is NULL.
>
> Depending on what you want/need, you could also add a depth field, so you
> don't have to loop to figure out where you are in the tree, etc.
>
> Storing node x/y/z as an identifier, as you suggest, is also an option
> with it's own benefits and downsides; nothing prevents you from taking any
> or all of those approaches at the same time.
>
>
> ----- Original Message -----
> > From: "Larry Martell" <larry.mart...@gmail.com>
> > To: "Steven Siebert" <smsi...@gmail.com>
> > Cc: "mysql mailing list" <mysql@lists.mysql.com>
> > Sent: Friday, 22 November, 2013 3:17:44 PM
> > Subject: Re: XML to RDB
> >
> > Yes, I will need to query the data, and yes, it's app specific to the
> data.
> > The parent-node structure will meet my needs if I can figure out how to
> > preserver the unbound multi level nodes. There will be a higher level
> > table, which will have the file name, date received, and other
> identifying
> > info that will be start of the query. The nodes will have a foreign key
> > back to the file table. The queries will be something like "get the abcd
> > parameter from node x/y/z from the file foo.xml from 10/10/13". I guess I
> > can just store the x/y/z in the node table (as opposed to trying to
> > represent z is child of y which is a child of x in 3 different tables or
> > rows.
> >
>
> --
> Unhappiness is discouraged and will be corrected with kitten pictures.
>

Re: XML to RDB

Reply via email to