[sqlite] Schema design and/or SELECT construction
Hello: I have about 3000 electronic images that I am preparing to distribute to my family members. They use a variety of operating systems, so I am providing the images on remote hard drives, with the images divided into folders based on years the images were created. All images were obtained via scanning of negatives and slides, and each image has been keyworded in EXIF format. What I want is to create a master SQLite database catalog of all the images, and I have come up with a proposed schema: TABLE main (all pertinent image data such as date, location in folder, etc.) TABLE core_keywords (id, name) --This table would only hold the names of my immediate family members, each with a corresponding id TABLE other_keyword (id,name) -- all other keywords I have thought to create two keyword fields in the main table, one to hold the ids of the core_keywords (comma separated) and one to hold the ids of the other_keywords, also comma separated. What I cannot devise is an elegant method to SELECT based on the core_keywords to achieve the following sorts: 1- Find images with a single core_keyword id, that is, only images of a single person, no other core persons in the image 2- Find images with a specific set of core_keyword ids, such as 1 and 6 or 2 and 5 and 7, etc., with no other core persons in the image The idea is to create a document with lists of all images that are exclusive to single individuals, specific pairs, etc., so that family members can easily find themselves or groups, regardless of image catalog software they use on their particular systems, which may or may not be able to perform these types of sorts. I am not asking anyone to actually write the SELECT statements for me, but rather point me toward the operands that would achieve my goal. I have read through the documentation, and I cannot seem to generate the logic in my head to SELECT WHERE core_id is only 4. If anyone has an idea on a more efficient database design, or TABLE schema, please do not hesitate to proffer your thoughts. I am hoping to have it all figured out BEFORE I load up the tables with data. (I am actually still scanning images at this stage, but trying to prepare for the next phase.) Thank you very much for your time and consideration. Craig Smith cr...@macscripter.net ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Schema design and/or SELECT construction
What's wrong with a properly normalised schema like this: create table main (id, name,...); create table keyword (id, label); create table crossref (id_main, id_keyword); The concept of core and other keywords is a bit arbitrary. What is important (i.e. core) today might not be so tomorrow. Parsing comma separated lists in a single attribute is likely to be a bother. On Mon, 2009-06-01 at 21:21 -0700, Craig Smith wrote: Hello: I have about 3000 electronic images that I am preparing to distribute to my family members. They use a variety of operating systems, so I am providing the images on remote hard drives, with the images divided into folders based on years the images were created. All images were obtained via scanning of negatives and slides, and each image has been keyworded in EXIF format. What I want is to create a master SQLite database catalog of all the images, and I have come up with a proposed schema: TABLE main (all pertinent image data such as date, location in folder, etc.) TABLE core_keywords (id, name) --This table would only hold the names of my immediate family members, each with a corresponding id TABLE other_keyword (id,name) -- all other keywords I have thought to create two keyword fields in the main table, one to hold the ids of the core_keywords (comma separated) and one to hold the ids of the other_keywords, also comma separated. What I cannot devise is an elegant method to SELECT based on the core_keywords to achieve the following sorts: 1- Find images with a single core_keyword id, that is, only images of a single person, no other core persons in the image 2- Find images with a specific set of core_keyword ids, such as 1 and 6 or 2 and 5 and 7, etc., with no other core persons in the image The idea is to create a document with lists of all images that are exclusive to single individuals, specific pairs, etc., so that family members can easily find themselves or groups, regardless of image catalog software they use on their particular systems, which may or may not be able to perform these types of sorts. I am not asking anyone to actually write the SELECT statements for me, but rather point me toward the operands that would achieve my goal. I have read through the documentation, and I cannot seem to generate the logic in my head to SELECT WHERE core_id is only 4. If anyone has an idea on a more efficient database design, or TABLE schema, please do not hesitate to proffer your thoughts. I am hoping to have it all figured out BEFORE I load up the tables with data. (I am actually still scanning images at this stage, but trying to prepare for the next phase.) Thank you very much for your time and consideration. Craig Smith cr...@macscripter.net ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Schema design and/or SELECT construction
On Mon, Jun 01, 2009 at 09:21:40PM -0700, Craig Smith scratched on the wall: Hello: I have about 3000 electronic images that I am preparing to distribute to my family members. They use a variety of operating systems, so I am providing the images on remote hard drives, with the images divided into folders based on years the images were created. All images were obtained via scanning of negatives and slides, and each image has been keyworded in EXIF format. What I want is to create a master SQLite database catalog of all the images, and I have come up with a proposed schema: TABLE main (all pertinent image data such as date, location in folder, etc.) TABLE core_keywords (id, name) --This table would only hold the names of my immediate family members, each with a corresponding id TABLE other_keyword (id,name) -- all other keywords I have thought to create two keyword fields in the main table, one to hold the ids of the core_keywords (comma separated) and one to hold the ids of the other_keywords, also comma separated. Generally, I would call this a bad idea. By most people's thinking it also breaks First Normal Form. Besides, you can't do database operations on comma separated lists. If you need a one-to-many (one pic to many keywords) then do it right and build an actual one-to-many relationship between two tables. Although, in this case, I'd say it is actually many-to-many. Might I suggest: TABLE pictures (pic_id INTEGER PRIMARY KEY, filename NOT NULL UNIQUE, . ); TABLE keywords (key_id INTEGER PRIMARY KEY, is_core bool, word NOT NULL UNIQUE); TABLE pic_to_key (pic_id, key_id, PRIMARY KEY (pic_id, key_id)); INDEX pic_to_key (key_id, pick_id); -- just because Define your pictures. Define your keywords. You can flag specific keywords as core. Define which pics have which keywords. You can then use a keyword to find pictures, or use pictures to find keywords. What I cannot devise is an elegant method to SELECT based on the core_keywords to achieve the following sorts: 1- Find images with a single core_keyword id, that is, only images of a single person, no other core persons in the image Lookup the key_id. Find all the pics with that key idea. Find all the key ids for those pictures. Count them, select the pics with 1. 2- Find images with a specific set of core_keyword ids, such as 1 and 6 or 2 and 5 and 7, etc., with no other core persons in the image Same basic idea, left as an exercise to the reader. I'll offer this hint, however: This is a situation that will sometimes lead to the need to do Relational Divison, something that isn't directly supported by SQL. Have a look at Celko's article on it: http://www.dbazine.com/ofinterest/oi-articles/celko1 You'll often end up with a vertical list of key_ids (a picture joined against pic_to_key to get a list of keys) and what you need is a horizontal list to run through your logic statements. That all points to Relational Division. Or you can just move the selection logic to your applicaiton. -j -- Jay A. Kreibich J A Y @ K R E I B I.C H Our opponent is an alien starship packed with atomic bombs. We have a protractor. I'll go home and see if I can scrounge up a ruler and a piece of string. --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Schema design and/or SELECT construction
Chris and Jay: Thank you both very much for your comments; that solves it. I am not a trained database designer, but I have resources on normalization; I simply neglected to consult them, thinking that this was a SELECT problem, not a design problem. Your input was just what I was hoping for. The concept of core and other keywords is a bit arbitrary. What is important (i.e. core) today might not be so tomorrow. Parsing comma separated lists in a single attribute is likely to be a bother. Generally, I would call this a bad idea. By most people's thinking it also breaks First Normal Form. Besides, you can't do database operations on comma separated lists. If you need a one-to-many (one pic to many keywords) then do it right and build an actual one-to-many relationship between two tables. Craig Smith cr...@macscripter.net ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users