With more than 15000 records you would be better off using a relational database. Although it will create more work to start with (you'll have to learn it), it will save you a lot of work in the medium and long term.
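To give a rough idea of what this looks like in practice, here is a minimal sketch using the sqlite3 module from the Python standard library. The table layout, the column names (username, post_count, website, known_spammer) and the sample rows are all made up for illustration; your real profiles have about 50 fields and you would pick the ones that matter.

```python
import sqlite3

# Hypothetical schema: these column names are illustrative only,
# not the real fields of the forum profiles.
def build_db(profiles):
    # ":memory:" keeps the database in RAM; pass a file name to persist it.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profiles ("
        " userid INTEGER PRIMARY KEY,"
        " username TEXT,"
        " post_count INTEGER,"
        " website TEXT,"
        " known_spammer INTEGER DEFAULT 0)"
    )
    # One parameterised INSERT is run for every tuple in `profiles`.
    conn.executemany(
        "INSERT INTO profiles"
        " (userid, username, post_count, website, known_spammer)"
        " VALUES (?, ?, ?, ?, ?)",
        profiles,
    )
    conn.commit()
    return conn

def lurkers_with_website(conn):
    """Return the userids that have never posted but list a website."""
    rows = conn.execute(
        "SELECT userid FROM profiles"
        " WHERE post_count = 0 AND website <> ''"
        " ORDER BY userid"
    )
    return [row[0] for row in rows]

# Made-up sample data standing in for the parsed HTML profiles.
sample = [
    (1, "alice", 120, "", 0),
    (2, "bob", 0, "http://example.com/pills", 1),
    (3, "carol", 0, "", 0),
    (4, "dave", 0, "http://example.org", 0),
]
conn = build_db(sample)
print(lurkers_with_website(conn))  # prints [2, 4]
```

The point is that a question such as "who never posts but advertises a website?" becomes a single query, rather than a loop over 10 to 50 dictionary files.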
Almost any relational database can be accessed from Python. As it is just for your own use, SQLite might be the most appropriate (it has a very small footprint), but MySQL is excellent and so are many others.

To use a relational database you might think about learning SQL. It is very easy (especially if you know any Boolean algebra), and it is a language that has been used almost unchanged for decades and shows every sign of staying around for a long time. In computing it is one of the most useful things you can learn. There is a good introductory, interactive tutorial at http://sqlcourse.com/

If you feel you need another abstraction layer on top of this you could look at SQLObject <http://www.sqlobject.org/>.

Personally I would recommend that you start with MySQL <http://www.mysql.com>. It is open source, easy to install and use, stable and fast. But with SQL engines you have lots of good choices.

Peter Jessop

On 12/13/06, Thomas <[EMAIL PROTECTED]> wrote:
I'm writing a program to analyse the profiles of the 15500 users of my forum. I have the profiles as HTML files stored locally and I'm using ClientForm to extract the various details from the HTML form in each file.

My goal is to identify lurking spammers, but also to learn how to better spot spammers by calculating statistical correlations in the data against known spammers.

I need advice on how to organise my data. There are 50 fields in each profile, and some fields will be much more use than others, so I thought about creating, say, 10 files to start off with, each containing a dictionary of userid to field value. That way I'm dealing with 10 to 50 files instead of 15500.

Also, I am inexperienced with using classes but eager to learn, and wonder if they would be any help in this case.

Any advice much appreciated and thanks in advance,

Thomas

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor