At 02:14 AM 12/14/2005, Liam Clarke wrote: >Hi all, > >Just contemplating. > >If in Python I were organising a data index along the lines of - > >j = { > >"k_word1" : ["rec1","rec2","rec3","rec4"], >... >"k_wordn" :["recX","rec4"] > >} > >and I was going to find records that matched by seeing what record >occurred in the most lists (via set intersections or similar; going to >have a play see what works faster) selected by searching keywords... > >how easily does that translate to a SQL table and query format?
Data modeling looks for relationships between objects. Relationships can be 1-1 1-many or many-many. Your case is a many-many (each keyword may appear in one or more records, and each record may contain one or more keywords.) The customary way to represent this in a relational database 3 tables. One with one row per keyword, one with one row per record and one "junction" or "association" table with one row for each keyword-record pair. KEYWORD TABLE kid keyword 1 cat 2 dog 3 mouse 4 bird 5 banana RECORD TABLE rid record 1 rexX 2 rec4 3 recAB 4 rec99 5 recFoo KEYWORD-RECORD TABLE kid rid 1 1 1 3 1 4 2 2 3 5 4 1 5 3 For processing things like this nothing IMHO beats a relational database and SQL. With many databases accessible from Python I strongly suggest this approach. SQLite is especially attractive. [snip] _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor