[sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Craig Smith
Hello:

I have about 3000 electronic images that I am preparing to distribute  
to my family members.  They use a variety of operating systems, so I  
am providing the images on remote hard drives, with the images divided  
into folders based on years the images were created.  All images were  
obtained via scanning of negatives and slides, and each image has been  
keyworded in EXIF format.

What I want is to create a master SQLite database catalog of all the  
images, and I have come up with a proposed schema:

TABLE main (all pertinent image data such as date, location in folder,  
etc.)
TABLE core_keywords (id, name) --This table would only hold the names  
of my immediate family members, each with a corresponding id
TABLE other_keyword (id,name) -- all other keywords

I have thought to create two keyword fields in the main table, one to  
hold the ids of the core_keywords (comma separated) and one to hold  
the ids of the other_keywords, also comma separated.  What I cannot  
devise is an elegant method to SELECT based on the core_keywords to  
achieve the following sorts:

1- Find images with a single core_keyword id, that is, only images of  
a single person, no other core persons in the image
2- Find images with a specific set of core_keyword ids, such as 1 and  
6 or 2 and 5 and 7, etc., with no other core persons in the image

The idea is to create a document with lists of all images that are  
exclusive to single individuals, specific pairs, etc., so that family  
members can easily find themselves or groups, regardless of image  
catalog software they use on their particular systems, which may or  
may not be able to perform these types of sorts.

I am not asking anyone to actually write the SELECT statements for me,  
but rather point me toward the operands that would achieve my goal.  I  
have read through the documentation, and I cannot seem to generate the  
logic in my head to SELECT WHERE core_id is only 4.

If anyone has an idea on a more efficient database design, or TABLE  
schema, please do not hesitate to proffer your thoughts.  I am hoping  
to have it all figured out BEFORE I load up the tables with data.  (I  
am actually still scanning images at this stage, but trying to prepare  
for the next phase.)

Thank you very much for your time and consideration.


Craig Smith
cr...@macscripter.net



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Chris Peachment
What's wrong with a properly normalised schema like this:

create table main (id, name,...);

create table keyword (id, label);

create table crossref (id_main, id_keyword);

The concept of core and other keywords is a bit arbitrary.
What is important (i.e. core) today might not be so tomorrow.

Parsing comma separated lists in a single attribute is
likely to be a bother.


On Mon, 2009-06-01 at 21:21 -0700, Craig Smith wrote:
 Hello:
 
 I have about 3000 electronic images that I am preparing to distribute  
 to my family members.  They use a variety of operating systems, so I  
 am providing the images on remote hard drives, with the images divided  
 into folders based on years the images were created.  All images were  
 obtained via scanning of negatives and slides, and each image has been  
 keyworded in EXIF format.
 
 What I want is to create a master SQLite database catalog of all the  
 images, and I have come up with a proposed schema:
 
 TABLE main (all pertinent image data such as date, location in folder,  
 etc.)
 TABLE core_keywords (id, name) --This table would only hold the names  
 of my immediate family members, each with a corresponding id
 TABLE other_keyword (id,name) -- all other keywords
 
 I have thought to create two keyword fields in the main table, one to  
 hold the ids of the core_keywords (comma separated) and one to hold  
 the ids of the other_keywords, also comma separated.  What I cannot  
 devise is an elegant method to SELECT based on the core_keywords to  
 achieve the following sorts:
 
 1- Find images with a single core_keyword id, that is, only images of  
 a single person, no other core persons in the image
 2- Find images with a specific set of core_keyword ids, such as 1 and  
 6 or 2 and 5 and 7, etc., with no other core persons in the image
 
 The idea is to create a document with lists of all images that are  
 exclusive to single individuals, specific pairs, etc., so that family  
 members can easily find themselves or groups, regardless of image  
 catalog software they use on their particular systems, which may or  
 may not be able to perform these types of sorts.
 
 I am not asking anyone to actually write the SELECT statements for me,  
 but rather point me toward the operands that would achieve my goal.  I  
 have read through the documentation, and I cannot seem to generate the  
 logic in my head to SELECT WHERE core_id is only 4.
 
 If anyone has an idea on a more efficient database design, or TABLE  
 schema, please do not hesitate to proffer your thoughts.  I am hoping  
 to have it all figured out BEFORE I load up the tables with data.  (I  
 am actually still scanning images at this stage, but trying to prepare  
 for the next phase.)
 
 Thank you very much for your time and consideration.
 
 
 Craig Smith
 cr...@macscripter.net
 
 
 
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Jay A. Kreibich
On Mon, Jun 01, 2009 at 09:21:40PM -0700, Craig Smith scratched on the wall:
 Hello:
 
 I have about 3000 electronic images that I am preparing to distribute  
 to my family members.  They use a variety of operating systems, so I  
 am providing the images on remote hard drives, with the images divided  
 into folders based on years the images were created.  All images were  
 obtained via scanning of negatives and slides, and each image has been  
 keyworded in EXIF format.
 
 What I want is to create a master SQLite database catalog of all the  
 images, and I have come up with a proposed schema:
 
 TABLE main (all pertinent image data such as date, location in folder,  
 etc.)
 TABLE core_keywords (id, name) --This table would only hold the names  
 of my immediate family members, each with a corresponding id
 TABLE other_keyword (id,name) -- all other keywords

 I have thought to create two keyword fields in the main table, one to  
 hold the ids of the core_keywords (comma separated) and one to hold  
 the ids of the other_keywords, also comma separated.

  Generally, I would call this a bad idea.  By most people's
  thinking it also breaks First Normal Form.
  
  Besides, you can't do database operations on comma separated lists.
  If you need a one-to-many (one pic to many keywords) then do it right
  and build an actual one-to-many relationship between two tables.

  Although, in this case, I'd say it is actually many-to-many.

  Might I suggest:

TABLE pictures (pic_id INTEGER PRIMARY KEY, filename NOT NULL UNIQUE, . );
TABLE keywords (key_id INTEGER PRIMARY KEY, is_core bool, word NOT NULL UNIQUE);
TABLE pic_to_key (pic_id, key_id, PRIMARY KEY (pic_id, key_id));
INDEX pic_to_key (key_id, pick_id);  -- just because

  Define your pictures.
  Define your keywords.  You can flag specific keywords as core.
  Define which pics have which keywords.

  You can then use a keyword to find pictures, or use pictures to find
  keywords.

 What I cannot  
 devise is an elegant method to SELECT based on the core_keywords to  
 achieve the following sorts:
 
 1- Find images with a single core_keyword id, that is, only images of  
 a single person, no other core persons in the image

  Lookup the key_id.  Find all the pics with that key idea.  Find all
  the key ids for those pictures.  Count them, select the pics with 1.

 2- Find images with a specific set of core_keyword ids, such as 1 and  
 6 or 2 and 5 and 7, etc., with no other core persons in the image

  Same basic idea, left as an exercise to the reader.

  I'll offer this hint, however:

  This is a situation that will sometimes lead to the need to do
  Relational Divison, something that isn't directly supported by SQL.
  Have a look at Celko's article on it:
  http://www.dbazine.com/ofinterest/oi-articles/celko1

  You'll often end up with a vertical list of key_ids (a picture
  joined against pic_to_key to get a list of keys) and what you need is
  a horizontal list to run through your logic statements.  That all
  points to Relational Division.

  Or you can just move the selection logic to your applicaiton.


   -j

-- 
Jay A. Kreibich  J A Y  @  K R E I B I.C H 

Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor.   I'll go home and see if I can scrounge up a ruler
 and a piece of string.  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Craig Smith
Chris and Jay:

Thank you both very much for  your comments; that solves it.  I am not  
a trained database designer, but I have resources on normalization; I  
simply neglected to consult them, thinking that this was a SELECT  
problem, not a design problem.  Your input was just what I was hoping  
for.

 The concept of core and other keywords is a bit arbitrary.
 What is important (i.e. core) today might not be so tomorrow.

 Parsing comma separated lists in a single attribute is
 likely to be a bother.

  Generally, I would call this a bad idea.  By most people's
  thinking it also breaks First Normal Form.

  Besides, you can't do database operations on comma separated lists.
  If you need a one-to-many (one pic to many keywords) then do it right
  and build an actual one-to-many relationship between two tables.

Craig Smith
cr...@macscripter.net



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users