On 04/17/2007 04:18 AM, Fabian Köhler wrote:
Thanks for all the input. The problem i have with this idea:

Really? Wow, my opinion is that you're trying to do in one table what you should do in two. Have a questions table, and an answers table. The answers table would have a column specifying which question they belong to. i.e.

QUESTIONS
int autoincrement questionID
question

ANSWERS
int autoincrement answerID
int questionID
answer

is, that
1) i must define the column with the value of the answer as a TEXT to
cover all possible answers, altough a SMALLINT might fit it better. but
i can live with that
Not a big deal. Any performance increase will be marginal, and the ease of having a standard column type in my opinion outweighs any performance decrease.
2) i think it's slow when i have lots of data. i'm currently not at home
to do performance tests, but let's assume i have 2000 questions, then i
collect by user input 2000 answers to every question by 10000 different
users (a realistic scenario for our application) and store it in a table
like this:


ANSWERS
int autoincrement answerID
int userID
int questionID
text answer
i then have 2000*10000 = 20 mill. rows in ANSWERS. now let's assume i
want to select all answers to all questions for 100 random users at
once. The only solution i see here for one query is a large join which
will take really, really long to get the data even with proper indexes
or did i do sth. wrong?

How is this not a processor intensive operation no matter what? If you're collecting this much data, you have to deal with that much data, the key is to arrange is to that the database knows something about the relationships between the data (questions * answers) in this case so that you can generate efficient queries.
is there a way to provide a view on these two tables (QUESTIONS and
ANSWERS) that simulate the original table i intended, which is actually
fast even with large amounts of data?
Yes, with either a VIEW or a JOIN.
Are other dbs like oracle better on doing joins on such large tables?
I can't speak for MySQL 5.0 because it's been a while since I've done performance tests, but the old 3.x versions of MySQL would perform much better than oracle with simple joins.
So i really like the idea of having one large table with many columns
because it's really fast and i can define each column properly but i
currently see no option to go this way but with that way there will
alway be a limit to the max. amount of answers in the system.

If you're doing that, why not just make a flat-file database of the answers? That would be much faster still than dealing with the added overhead of a SQL server. You're ignoring the advantages of having a SQL setup in this case anyhow.

-Micah

Reply via email to