Hi all,

I'm going to build a repository for my code snippets (I know there are
existing solutions, but I like to roll my own to see what I can learn),
and I'm wondering how best to store and search it.

My current plan is to read each file as it is added and extract
keywords (e.g. variable names, imported libraries, etc.) to use as keys
in a dictionary. Each key points to a list of the files containing it
(the files themselves will be stored in a zip or similar).

For example:

words = {

"imaplib": ["X.py"],
"smtplib": ["X.py", "Y.py"]

}
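
A rough sketch of how such a dictionary might be built, assuming the
snippets are Python source and using the standard ast module to pull out
imported module names and identifiers. The function names
(extract_keywords, build_index) and the sample file contents are made up
for illustration; reading the files out of a zip is left out.

```python
import ast
from collections import defaultdict


def extract_keywords(source):
    """Collect imported module names and identifier names from Python source."""
    keywords = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            keywords.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            keywords.add(node.module)
        elif isinstance(node, ast.Name):
            keywords.add(node.id)
    return keywords


def build_index(files):
    """Map each keyword to a sorted list of the files containing it."""
    index = defaultdict(set)
    for filename, source in files.items():
        for keyword in extract_keywords(source):
            index[keyword].add(filename)
    return {keyword: sorted(names) for keyword, names in index.items()}


# Hypothetical snippet contents, standing in for files read from the archive.
snippets = {
    "X.py": "import imaplib\nimport smtplib\n",
    "Y.py": "import smtplib\n",
}
words = build_index(snippets)
```

With these two sample files, words ends up with the same shape as the
dictionary above.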

And if I do a search for "smtplib" or "imaplib" it'll return X.py
first, as it contains both, and Y.py second.
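
Something like the following is what I have in mind for the search,
assuming results are ranked simply by how many of the search terms each
file contains. The search function is only a sketch, not a tested
design, and ties are broken alphabetically by file name.

```python
from collections import Counter

words = {
    "imaplib": ["X.py"],
    "smtplib": ["X.py", "Y.py"],
}


def search(index, terms):
    """Return file names ordered by how many of the terms they contain."""
    hits = Counter()
    for term in terms:
        # Unknown terms simply contribute no matches.
        for filename in index.get(term, []):
            hits[filename] += 1
    # Most matching terms first; file name breaks ties for a stable order.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [filename for filename, _ in ranked]


results = search(words, ["smtplib", "imaplib"])
```

Each term costs one dictionary lookup, so the work should depend mostly
on the number of search terms and matching files rather than on the
total number of keys.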

Is there a more efficient approach? I don't think a relational DB
really suits this sort of task at present, but I'm wondering about
search times.

I've googled for search theory, and hit c2's wiki up, but I'm having
no luck at finding applicable, understandable theory... a lot of stuff
about trees and nodes and paths through graphs, but I can't correlate
that to my problem. (I can see how graph pathfinding could relate to
routing a connection over a network, though.)

Can anyone recommend a good introduction to the theory of searching? I
really need to take some Computer Science courses.

Regards,

Liam Clarke
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
