David Blomstrom <[EMAIL PROTECTED]> wrote on 08/26/2005 01:32:08 
AM:

> (Sorry if this is a repeat; I think my first post
> didn't go through.)
> 
> I'm still stuck on the problem I asked about a day or
> two ago. I'm working on a page at
> http://www.geozoo.org/stacks/ that draws data from a
> table that lists animal taxons (orders, families,
> species, etc.) in a child-parent relationship.
> 
> It works exactly the way it should. Try
> http://www.geozoo.org/stacks/Animalia, watching the
> navigation links and the column on the right, for
> example. The problem is that the children of genera -
> species - are properly displayed as TWO WORDS - the
> child (species) and parent (genus).
> 
> http://www.geozoo.org/stacks/Canis illustrates the
> problem I run into when I tweak my PHP so that Canis
> lupus is displayed instead of just lupus, for example.
> If you click Canis lupus or type in
> http://www.geozoo.org/stacks/Canis_lupus, you get a
> 404 Page Not Found Error.
> 
> I want it to work like this page:
> 
> http://animaldiversity.ummz.umich.e...anis_lupus.html
> 
> Notice that the parent displays with just one
> variable:
> 
> http://animaldiversity.ummz.umich.e...tion/Canis.html
> 
> But I don't think I'm ever going to figure this out
> until I encounter someone who already has a similar
> script up and running. In the meantime, I had another
> idea. Suppose I create a new table field that lists
> the full species name, including an underscore.
> 
> For example, genera and species look something like
> this in my current table:
> 
> NAME | PARENT
> Canis | Canidae
> lupus | Canis
> Panthera | Felidae
> leo | Panthera
> Home | Pongidae
> sapiens | Homo
> 
> My new table might look like this:
> 
> NEWNAME | NAME | PARENT
> Canis | Canis | Canidae
> Canis_lupus | lupus | Canis
> Panthera | Panthera | Felidae
> Panthera_leo | leo | Panthera
> Homo | Homo | Pongidae
> Homo_sapiens | sapiens | Homo
> 
> So instead of displaying Parent + Name (Homo sapiens)
> and adding an underscore, I just display NewName
> (Homo_sapiens).
> 
> I'm just wondering if there's anything I need to know
> about using underscores in database tables. I assume I
> can manipulate the underscore with PHP and/or Apache
> mod_rewrite, if necessary.
> 
> Thanks.
> 
> 
To answer your direct question: You can use underscores in your data. 
MySQL has no problems treating them like any other text character.

If you are desperate to conditionally add underscores to your artificially 
generated URLs, wouldn't it be just as easy to add a `taxon` field to your 
hierarchy table to show which level of the taxonomy you are working with? 
That way you can key off of the value 'species' and display that row as 
"parent_name" instead of just "name"? The reverse could be true, if the 
URL has an underscore in it, you want to look for a taxon of the type 
"species" and split the URL into it's genus and species components.

NAME | PARENT | TAXON
Canis | Canidae | genus
lupus | Canis | species
Panthera | Felidae | genus
leo | Panthera | species
Home | Pongidae | genus
sapiens | Homo | species

I have a better idea: Fix your database design and quit trying to make it 
look like your front-end.

Please, please! read up on normalization. The way I would probably handle 
this design issue is by creating a separate table of taxonomic names (I 
know this is very different than my previous advice but I have had longer 
to think about it).

-----------------
TAXONS
-----------------
ID | LEVEL | NAME
1 | kingdom | Animalia
2 | genus | Canidae
3 | species | leo
4 | family | Felidae
5 | genus | Panthera
6 | family | Pongidae
7 | genus | Homo
8 | species | erectus
9 | species | sapiens

This table is important because it assigns a unique ID value to each name. 
Next I would create an ANIMALS table. Each member of the animal kingdom 
can be uniquely identified by a combination of (genus,species) so the 
table would look something like this:

------------------------
ANIMAL
------------------------
ID | genustaxon_id | speciestaxon_id | commonname | ... other fields about 
the animal
1 | 7 | 9 | Modern humans | ...

The PK would be the ID field
a UNIQUE index would be on (genustaxon_id,speciestaxon_id) to prevent 
duplication
a second UNIQUE index would probably also be defined on (commonname) to 
also guard against duplicates there as well.

But these taxons also exist in a hierarchy. Using a modified preordered 
tree traversal model similar to the one you are trying to implement as a 
guide ("Storing Hierarchical Data in a Database" by Gils Van Tulder - 
http://www.sitepoint.com/article/hierarchical-data-database/2). I would 
recommend a tree structure like this:

------------------------
TAXONTREE
------------------------
nodeID | parentnode_id | leftposition | rightposition | taxon_ID | 
animal_id

animal_id would be NULL for all nodes where taxon_id didn't point to an 
actual species item. For those cases where you define a slot for an animal 
but haven't actually created an ANIMAL record, use a zero for animal_id. 
For the case of the species "leo", where at least two animals share that 
species name, you would end up with entries _could_ look something like 
this:

3 = the ID of the row in the TAXONS table that contains the value "leo"
2346 = the ID of the row in the ANIMALS table for the Water dog or "Canis 
leo"
19768 = the ID of the row in the ANIMALS table for the African lion or 
"Panthera leo"

nodeID | parentnode_id | leftposition | rightposition | taxon_ID | 
animal_id
...
456 | 300 | 456 | 512 | 3 | 2346
...
1934 | 1899 | 1934 | 1966 | 3 | 19768
...

Now the reason for reducing this table to all numbers is three-fold:

1) Numbers take up the least amount of space of any data type. That means 
that more indexes and more table data can fit into memory at any one time. 
Time spent reading and writing data to and from the hard drives is the 
leading cause of performance degradation especially if you are using a 
computer with a fast CPU. The less time you spend accessing the disks, the 
faster your database will perform.
2) Numbers compare to numbers several times (6-50 times) faster than words 
compare to words. Faster compares = faster searches = better application 
performance
3) Changing the spelling of any animal's name, any taxon, or any other 
"word" will NOT break the relationships established in the TAXONTREE 
table. This provides data reliability and program stability.

Why did I create an ID column for the tree when it wasn't discussed in the 
sitepoint article? So that each node will be uniquely identifiable. 
Period. 

Does that help you see what the other responses have been suggesting? Does 
that help to drive home the point that what shape the data needs to be in 
has little to NOTHING to do with how you design your user interfaces? In 
99% of the cases, an efficient database design will perform magnitudes 
better than one that is poorly designed. With an efficient database on the 
back end, it is generally much easier to design an efficient front-end to 
handle user interaction. Don't try to keep the two in synch, it usually 
never works and your performance will be totally pear-shaped. In a 
data-driven application like this, the data comes first. How you work with 
the data (your PHP code) needs to adjust to your data storage model, not 
the other way around.

Respectfully,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to