On Mon, Nov 25, 2013 at 8:29 AM, Steven Siebert <smsi...@gmail.com> wrote:
> Larry, > > > If that's the only type of queries you're gonna make, sounds like MySQL > can be a good solution (from an indexing perspective, those queries will be > fine). Basically, you would be using MySQL as a read-only copy of the XML > content...so your goal here is to understand how you'll be needing to > query, and build the best model (in MySQL) you can to achieve these > queries. Going back to your initial question, I don't believe there is an > organic MySQL tool (or even a 3rd party tool that I know about) that will > do this for you - you'll probably have to write something, very small, to > do this. Not sure how much programming experience you have...but something > like a simple SAX parser producing XML element and attribute "rows" that > you (probably batch) insert into the schema you make will work quite > easily...with very little code. > I've been programing for over 30 years. I already have written some python code using ElementTree that parses the file. > Be careful with the schema, especially the IDs. The schema suggested by > Johan will work in simple cases, but might become a problem with unbounded > elements and if you wanted to break out XML arrays, for example (ie more > than one instance of a node's ID). If you need to preserve sequence, that > might be something you need to consider too. If you don't foresee the need > to query the parent hierarchy, I wouldn't add it (no parent_id)...not only > would it not be necessary, you have to consider how you want to query > hierarchy...if you need to, checked out the Nested Set model as opposed to > the adjacent list (see h > ttp://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/). > Again, this all depends on what you need...but based on the queries you > indicated, you wouldn't need it. > Thanks for the link - the Nested Set model looks very interesting, but as you say, I probably won't need that. This will most likely be an iterative process as how it will be queried will probably evolve. > Good luck! > > S > > > > > On Mon, Nov 25, 2013 at 4:29 AM, Johan De Meersman <vegiv...@tuxera.be>wrote: > >> _______________ >> | Nodes | >> |---------------| >> | id | --- >> | [more fields] | | >> | parent_id | >-- >> |_______________| >> >> >> And then you join the table with itself as needed. Do note that you'll >> need a self-join for every level you query, and that every self-join >> incrementally slows down the query. >> >> It may be beneficial (actually, I'm pretty sure it will be :-) ) to >> implement the recursion in software as necessary, simply follow the branch >> you want and check wether there are children at each turn; or pick up the >> child and go back up until parent_id is NULL. >> >> Depending on what you want/need, you could also add a depth field, so you >> don't have to loop to figure out where you are in the tree, etc. >> >> Storing node x/y/z as an identifier, as you suggest, is also an option >> with it's own benefits and downsides; nothing prevents you from taking any >> or all of those approaches at the same time. >> >> >> ----- Original Message ----- >> > From: "Larry Martell" <larry.mart...@gmail.com> >> > To: "Steven Siebert" <smsi...@gmail.com> >> > Cc: "mysql mailing list" <mysql@lists.mysql.com> >> > Sent: Friday, 22 November, 2013 3:17:44 PM >> > Subject: Re: XML to RDB >> > >> > Yes, I will need to query the data, and yes, it's app specific to the >> data. >> > The parent-node structure will meet my needs if I can figure out how to >> > preserver the unbound multi level nodes. There will be a higher level >> > table, which will have the file name, date received, and other >> identifying >> > info that will be start of the query. The nodes will have a foreign key >> > back to the file table. The queries will be something like "get the abcd >> > parameter from node x/y/z from the file foo.xml from 10/10/13". I guess >> I >> > can just store the x/y/z in the node table (as opposed to trying to >> > represent z is child of y which is a child of x in 3 different tables or >> > rows. >> > >> >> -- >> Unhappiness is discouraged and will be corrected with kitten pictures. >> > >