With more than 15000 records you would be better off using a relational
database.
Although it will create more work to start with (you'll have to learn it),
it will save you a lot of work in the medium and long term.

Almost any relational database can be accessed from Python. As it is just for
your own use, SQLite might be the most appropriate (it has a very small
footprint), but MySQL is excellent and so are many others.
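
To give a rough idea of what this looks like from Python, here is a minimal
sketch using the standard library's sqlite3 module. The table name "profiles"
and its four columns are made up for illustration (a real profile has around
50 fields), and the sample rows are invented, so treat it as a starting point
rather than a finished design.

```python
import sqlite3


def create_profile_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists.

    ":memory:" keeps everything in RAM; pass a filename to keep the data.
    The column names here are hypothetical placeholders.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profiles ("
        " userid INTEGER PRIMARY KEY,"
        " username TEXT,"
        " website TEXT,"
        " post_count INTEGER)"
    )
    return conn


def save_profiles(conn, rows):
    """Insert or update a batch of (userid, username, website, post_count)."""
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO profiles VALUES (?, ?, ?, ?)", rows
        )


# Invented sample data standing in for values parsed from the html files.
conn = create_profile_db()
save_profiles(conn, [
    (1, "alice", "", 120),
    (2, "bob", "http://example.com", 0),
])
count = conn.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]
print(count)
```

One table with a row per user replaces both the 15500 html files and the
10 to 50 dictionary files, and adding a field later is just another column.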

To use a relational database you might think about learning SQL. It is very
easy (especially if you know any Boolean algebra) and is a language that
has been used almost unchanged for decades and shows every sign of staying
around for a long time. In computing it is one of the most useful things you
can learn. There is a good introductory, interactive tutorial
at http://sqlcourse.com/
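
As a small taste of how Boolean conditions carry over into SQL, the sketch
below runs a SELECT with an AND condition through Python's sqlite3 module.
The table layout, the sample rows and the rule itself (a website link but
very few posts) are assumptions invented for the example, not a tested way
of spotting spammers.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical, cut-down profile table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles ("
    " userid INTEGER PRIMARY KEY, website TEXT, post_count INTEGER)"
)
conn.executemany("INSERT INTO profiles VALUES (?, ?, ?)", [
    (1, "", 120),
    (2, "http://example.com", 0),
    (3, "http://example.org", 45),
    (4, "", 0),
])


def find_suspects(conn, max_posts=0):
    """Return userids that list a website AND have at most max_posts posts."""
    cur = conn.execute(
        "SELECT userid FROM profiles"
        " WHERE website <> '' AND post_count <= ?"
        " ORDER BY userid",
        (max_posts,),  # the ? placeholder keeps values out of the SQL text
    )
    return [row[0] for row in cur]


suspects = find_suspects(conn)
print(suspects)
```

The WHERE clause is ordinary Boolean algebra: swap AND for OR, or add NOT,
and the database does the filtering for you.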

If you feel you need another abstraction layer on top of this you could look
at SQLObject <http://www.sqlobject.org/>.

Personally I would recommend that you start with MySQL <http://www.mysql.com>.
It is open source, easy to install and use, stable and fast. But with SQL
engines you have lots of good choices.

Peter Jessop


On 12/13/06, Thomas <[EMAIL PROTECTED]> wrote:
I'm writing a program to analyse the profiles of the 15500 users of my
forum. I have the profiles as html files stored locally and I'm using
ClientForm to extract the various details from the html form in each
file.

My goal is to identify lurking spammers but also to learn how to
better spot spammers by calculating statistical correlations in the
data against known spammers.

I need advice on how to organise my data. There are 50 fields in
each profile, some fields will be much more use than others so I
thought about creating say 10 files to start off with that contained
dictionaries of userid to field value. That way I'm dealing with 10 to
50 files instead of 15500.

Also, I am inexperienced with using classes but eager to learn and
wonder if they would be any help in this case.

Any advice much appreciated and thanks in advance,
Thomas
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
