Thanks, yes, this will help a lot.
On Wednesday, 8 July 2020 at 11:33:31 UTC+2 Simon King wrote:
You should start by enabling SQLAlchemy logging to see the actual
queries that are being run. The easiest way is to pass "echo=True" to
your create_engine call. (You could also try echo="debug", but since
you've got hundreds of thousands of rows you'll be swamped)
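A minimal sketch of the logging switch Simon describes (the connection URL here is a stand-in; any real URL works the same way):

```python
from sqlalchemy import create_engine

# echo=True logs every SQL statement the engine emits,
# so you can see exactly which queries run and how often.
engine = create_engine("sqlite://", echo=True)

# echo="debug" additionally logs result rows -- very verbose
# with hundreds of thousands of rows, as noted above:
# engine = create_engine("sqlite://", echo="debug")
```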
Verify that the queries look the way you expect.
I'd like to redesign the DB but that's not on the cards at the moment (or
ever hehe).
Thanks for the feedback, I appreciate all the help, I really do.
What puzzles me now is, why would the looping through each student (using
with_entities) and getting the subjects (using with_entities) for each
Based on what you shared above:
* The "Subject" table is: `StudentId, SubjectCode, SubjectName`
* There are 181 subjects
It looks like you don't have a "Subject" table, but a "StudentAndSubject"
table.
I think you'd get a bigger performance improvement by normalizing that
data into two tables.
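A sketch of the normalized layout being suggested, with invented SQLAlchemy model names: each of the 181 subjects is stored once in its own table, and a narrow association table links students to subjects.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Association table: one narrow row per (student, subject) pair.
student_subject = Table(
    "student_subject",
    Base.metadata,
    Column("student_id", ForeignKey("student.id"), primary_key=True),
    Column("subject_code", ForeignKey("subject.code"), primary_key=True),
)

class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    subjects = relationship("Subject", secondary=student_subject)

class Subject(Base):
    __tablename__ = "subject"
    # Each subject row exists once, instead of once per student.
    code = Column(String(100), primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```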
Nope, I'm running it on a Xubuntu VM. The VM only has 4 gigs of RAM and 4
cores assigned to it.
Nope, the students table has created_on, updated_on, id, 3 integer columns
and a boolean column.
The subjects table has student_id, code and name columns. The code column
is character varying(100)
2:25 minutes, even for 1,267,000 rows, sounds like quite a long time
to me. Are you physically a long way from the database, querying
across the internet? Do the tables contain many columns, or perhaps
large BLOB or TEXT fields?
If there are large fields that you don't need very often, you could
avoid loading them on every query.
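One way to do that is SQLAlchemy's column deferral; this is a sketch, and the `bio` column is invented purely for illustration:

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, deferred

Base = declarative_base()

class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    firstname = Column(String)
    # Hypothetical large column: deferred() keeps it out of the
    # default SELECT; it is loaded only when actually accessed.
    bio = deferred(Column(Text))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Student(id=1, firstname="Ann", bio="a very large text blob"))
    session.commit()

with Session(engine) as session:
    s = session.query(Student).first()  # SELECT omits the bio column
    # touching s.bio here would issue a second, targeted SELECT
```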
I'm currently testing with 7000 students with 181 subjects.
I first went over to the DB to run the query directly on there. I did an
inner join between the two tables.
It took about 2:25 minutes. I then selected specific columns from the two
tables and the time dropped to about 2 minutes or so.
How long is it taking? You mentioned 2000 students with 100 subjects
each, so there are something like 200,000 rows in the Subjects table,
and you need to load all of it. I wouldn't expect that to take longer
than a couple of seconds though.
Simon
On Mon, Jul 6, 2020 at 7:34 PM Justvuur wrote:
I added the eager-loading but it seems to slow down the SQL query quite a
lot.
It's as if the SQL query is now taking longer but the generating of the
file is quicker hehe... I guess the queries are now being fired up front
to populate the subjects.
It's still taking relatively the same amount of time overall.
Are you eager-loading the "student.subjects" relationship? If not,
that will give you the biggest performance increase. Without that, you
will be issuing a separate DB query for each of the students, to load
that student's subjects. Eager-loading allows you to preload the
subjects for every student up front.
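A minimal, self-contained sketch of the eager-loading being described, with invented model names; `selectinload` issues one extra query for all subjects (and `joinedload` would use a single JOIN instead):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()

class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    subjects = relationship("Subject")

class Subject(Base):
    __tablename__ = "subject"
    id = Column(Integer, primary_key=True)
    student_id = Column(ForeignKey("student.id"))
    code = Column(String(100))
    name = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Student(id=1, subjects=[Subject(code="MATH", name="Maths")]))
    session.commit()
    # selectinload fires ONE extra query covering every student's
    # subjects, instead of one lazy query per student (the N+1 problem).
    students = (
        session.query(Student)
        .options(selectinload(Student.subjects))
        .all()
    )
```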
Hi Simon, thanks for the help! I've never used that before, it's quite
handy.
I'm looping through all the students and printing them and their subject
details to a CSV file.
What makes things a tad complicated is the subjects must appear in a
specific order.
There is a table that defines the required order of the subjects.
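One plain-Python way to handle that fixed ordering when writing the CSV (all names here are invented for illustration): load the ordered subject codes once, then build each student's row with a dict lookup per code, so a missing subject becomes a blank cell.

```python
import csv
import io

# Hypothetical ordered subject codes, e.g. loaded from the ordering table.
ordered_codes = ["MATH", "PHYS", "HIST"]

# Hypothetical per-student data: subject code -> grade.
students = [
    {"id": 1, "name": "Ann", "subjects": {"MATH": "A", "HIST": "B"}},
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "name"] + ordered_codes)
for s in students:
    # The dict lookup keeps each subject in its fixed column;
    # subjects the student doesn't take become empty cells.
    row = [s["id"], s["name"]] + [s["subjects"].get(c, "") for c in ordered_codes]
    writer.writerow(row)
```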
Are you trying to optimise the database access (ie. minimize the
number of queries), or provide a nice dictionary-style API for your
Student objects? What do you mean when you say that looping over
student.subjects is quite heavy?
An association proxy can be used to get dict-style access to a
student's subjects.
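A sketch of that association-proxy idea (model names invented): keying the `subjects` collection by `code` and proxying through to `name` gives `student.subject_names["MATH"]`-style access.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.orm.collections import attribute_mapped_collection

Base = declarative_base()

class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    # Collection keyed by Subject.code instead of a plain list.
    subjects = relationship(
        "Subject", collection_class=attribute_mapped_collection("code")
    )
    # Dict-style view: subject code -> subject name.
    subject_names = association_proxy("subjects", "name")

class Subject(Base):
    __tablename__ = "subject"
    id = Column(Integer, primary_key=True)
    student_id = Column(ForeignKey("student.id"))
    code = Column(String(100))
    name = Column(String)

s = Student(id=1)
s.subjects["MATH"] = Subject(code="MATH", name="Mathematics")
```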
Hi there,
I'm struggling to find an efficient way to get a two-column subset into
dictionary form.
I have an entity that has a subset of data. The subset is linked to the
entity via Id. The order of the subset of data is defined in another table.
Example:
Student - Id, firstname, lastname