David Blomstrom <[EMAIL PROTECTED]> wrote on 08/26/2005 01:32:08 AM:
> (Sorry if this is a repeat; I think my first post > didn't go through.) > > I'm still stuck on the problem I asked about a day or > two ago. I'm working on a page at > http://www.geozoo.org/stacks/ that draws data from a > table that lists animal taxons (orders, families, > species, etc.) in a child-parent relationship. > > It works exactly the way it should. Try > http://www.geozoo.org/stacks/Animalia, watching the > navigation links and the column on the right, for > example. The problem is that the children of genera - > species - are properly displayed as TWO WORDS - the > child (species) and parent (genus). > > http://www.geozoo.org/stacks/Canis illustrates the > problem I run into when I tweak my PHP so that Canis > lupus is displayed instead of just lupus, for example. > If you click Canis lupus or type in > http://www.geozoo.org/stacks/Canis_lupus, you get a > 404 Page Not Found Error. > > I want it to work like this page: > > http://animaldiversity.ummz.umich.e...anis_lupus.html > > Notice that the parent displays with just one > variable: > > http://animaldiversity.ummz.umich.e...tion/Canis.html > > But I don't think I'm ever going to figure this out > until I encounter someone who already has a similar > script up and running. In the meantime, I had another > idea. Suppose I create a new table field that lists > the full species name, including an underscore. > > For example, genera and species look something like > this in my current table: > > NAME | PARENT > Canis | Canidae > lupus | Canis > Panthera | Felidae > leo | Panthera > Home | Pongidae > sapiens | Homo > > My new table might look like this: > > NEWNAME | NAME | PARENT > Canis | Canis | Canidae > Canis_lupus | lupus | Canis > Panthera | Panthera | Felidae > Panthera_leo | leo | Panthera > Homo | Homo | Pongidae > Homo_sapiens | sapiens | Homo > > So instead of displaying Parent + Name (Homo sapiens) > and adding an underscore, I just display NewName > (Homo_sapiens). > > I'm just wondering if there's anything I need to know > about using underscores in database tables. I assume I > can manipulate the underscore with PHP and/or Apache > mod_rewrite, if necessary. > > Thanks. > > To answer your direct question: You can use underscores in your data. MySQL has no problems treating them like any other text character. If you are desperate to conditionally add underscores to your artificially generated URLs, wouldn't it be just as easy to add a `taxon` field to your hierarchy table to show which level of the taxonomy you are working with? That way you can key off of the value 'species' and display that row as "parent_name" instead of just "name"? The reverse could be true, if the URL has an underscore in it, you want to look for a taxon of the type "species" and split the URL into it's genus and species components. NAME | PARENT | TAXON Canis | Canidae | genus lupus | Canis | species Panthera | Felidae | genus leo | Panthera | species Home | Pongidae | genus sapiens | Homo | species I have a better idea: Fix your database design and quit trying to make it look like your front-end. Please, please! read up on normalization. The way I would probably handle this design issue is by creating a separate table of taxonomic names (I know this is very different than my previous advice but I have had longer to think about it). ----------------- TAXONS ----------------- ID | LEVEL | NAME 1 | kingdom | Animalia 2 | genus | Canidae 3 | species | leo 4 | family | Felidae 5 | genus | Panthera 6 | family | Pongidae 7 | genus | Homo 8 | species | erectus 9 | species | sapiens This table is important because it assigns a unique ID value to each name. Next I would create an ANIMALS table. Each member of the animal kingdom can be uniquely identified by a combination of (genus,species) so the table would look something like this: ------------------------ ANIMAL ------------------------ ID | genustaxon_id | speciestaxon_id | commonname | ... other fields about the animal 1 | 7 | 9 | Modern humans | ... The PK would be the ID field a UNIQUE index would be on (genustaxon_id,speciestaxon_id) to prevent duplication a second UNIQUE index would probably also be defined on (commonname) to also guard against duplicates there as well. But these taxons also exist in a hierarchy. Using a modified preordered tree traversal model similar to the one you are trying to implement as a guide ("Storing Hierarchical Data in a Database" by Gils Van Tulder - http://www.sitepoint.com/article/hierarchical-data-database/2). I would recommend a tree structure like this: ------------------------ TAXONTREE ------------------------ nodeID | parentnode_id | leftposition | rightposition | taxon_ID | animal_id animal_id would be NULL for all nodes where taxon_id didn't point to an actual species item. For those cases where you define a slot for an animal but haven't actually created an ANIMAL record, use a zero for animal_id. For the case of the species "leo", where at least two animals share that species name, you would end up with entries _could_ look something like this: 3 = the ID of the row in the TAXONS table that contains the value "leo" 2346 = the ID of the row in the ANIMALS table for the Water dog or "Canis leo" 19768 = the ID of the row in the ANIMALS table for the African lion or "Panthera leo" nodeID | parentnode_id | leftposition | rightposition | taxon_ID | animal_id ... 456 | 300 | 456 | 512 | 3 | 2346 ... 1934 | 1899 | 1934 | 1966 | 3 | 19768 ... Now the reason for reducing this table to all numbers is three-fold: 1) Numbers take up the least amount of space of any data type. That means that more indexes and more table data can fit into memory at any one time. Time spent reading and writing data to and from the hard drives is the leading cause of performance degradation especially if you are using a computer with a fast CPU. The less time you spend accessing the disks, the faster your database will perform. 2) Numbers compare to numbers several times (6-50 times) faster than words compare to words. Faster compares = faster searches = better application performance 3) Changing the spelling of any animal's name, any taxon, or any other "word" will NOT break the relationships established in the TAXONTREE table. This provides data reliability and program stability. Why did I create an ID column for the tree when it wasn't discussed in the sitepoint article? So that each node will be uniquely identifiable. Period. Does that help you see what the other responses have been suggesting? Does that help to drive home the point that what shape the data needs to be in has little to NOTHING to do with how you design your user interfaces? In 99% of the cases, an efficient database design will perform magnitudes better than one that is poorly designed. With an efficient database on the back end, it is generally much easier to design an efficient front-end to handle user interaction. Don't try to keep the two in synch, it usually never works and your performance will be totally pear-shaped. In a data-driven application like this, the data comes first. How you work with the data (your PHP code) needs to adjust to your data storage model, not the other way around. Respectfully, Shawn Green Database Administrator Unimin Corporation - Spruce Pine