Nope, I'm running it on a Xubuntu VM. The VM only has 4 gigs of RAM and 4 cores assigned to it. Nope, the students table has created_on, updated_on, id, 3 integer columns and a boolean column. The subjects table has student_id, code and name columns. The code column is character varying(100) and the name column is character varying.
Yes, so its not over the net, its running locally. That's a good idea! Maybe I'll try that! Thanks for your help on this, I really appreciate it. On Tuesday, 7 July 2020 15:56:04 UTC+2, Simon King wrote: > > 2:25 minutes, even for 1,267,000 rows, sounds like quite a long time > to me. Are you physically a long way from the database, querying > across the internet? Do the tables contain many columns, or perhaps > large BLOB or TEXT fields? > > If there are large fields that you don't need very often, you could > consider marking them as "deferred", then they wouldn't be loaded by > default: > > > https://docs.sqlalchemy.org/en/13/orm/loading_columns.html#deferred-column-loading > > > I suppose if your database server was underpowered, loading over a > million rows might exhaust its memory and cause it to start swapping, > which would hurt performance, in which case querying for smaller > amounts of data might be better. > > Simon > > On Tue, Jul 7, 2020 at 12:53 PM Justvuur <just...@gmail.com <javascript:>> > wrote: > > > > I'm currently testing with 7000 students with 181 subjects. > > I first went over to the DB to run the query directly on there. I did an > innerjoin between the two tables. > > It took about 2:25 minutes. I then selected specific columns from the > two tables and the time dropped to about 2 minutes or so. > > I then tried a "select subjects where student ID in (...)" query and > that executed a lot quicker than the others. > > > > So then on the python side of things, I first fetched the students and > only selected id, created_on and updated_on columns using the with_entities > option. > > That seemed to take less than 1 second. I then, for each student, > executed a DB query to fetch the subjects using the with_entities option. > > For some reason, this seems to work faster than using the > eager-loading/students.subjects/attribute_mapped_collection. They seem to > take 2 minutes longer than the above (what should be inefficient) approach. > > > > On Tuesday, 7 July 2020 at 12:50:13 UTC+2 Simon King wrote: > >> > >> How long is it taking? You mentioned 2000 students with 100 subjects > >> each, so there are something like 200,000 rows in the Subjects table, > >> and you need to load all of it. I wouldn't expect that to take longer > >> than a couple of seconds though. > >> > >> Simon > >> > >> On Mon, Jul 6, 2020 at 7:34 PM Justvuur <just...@gmail.com> wrote: > >> > > >> > I added the eager-loading but it seems to slow down the SQL query > quite a lot. > >> > It's as if now, the SQL query is taking longer but the generating of > the file is quicker hehe... I guess now the queries are being fired before > to populate subjects. > >> > It's still taking relatively the same amount of time though. > >> > > >> > > >> > On Friday, 3 July 2020 17:07:43 UTC+2, Simon King wrote: > >> >> > >> >> Are you eager-loading the "student.subjects" relationship? If not, > >> >> that will give you the biggest performance increase. Without that, > you > >> >> will be issuing a separate DB query for each of the students, to > load > >> >> that student's subjects. Eager-loading allows you to preload the > >> >> subjects for every student in a single query: > >> >> > >> >> > https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#joined-eager-loading > > >> >> > >> >> Simon > >> >> > >> >> On Fri, Jul 3, 2020 at 1:36 PM Justvuur <just...@gmail.com> wrote: > >> >> > > >> >> > Hi Simon, thanks for the help! I've never used that before, it's > quite handy. > >> >> > > >> >> > I'm looping through all the students and printing them and their > subject details to a CSV file. > >> >> > What makes things a tad complicated is the subjects must appear in > a specific order. > >> >> > There is a table that has the subject code and order number > (ordered_subjects used below is the resultset from it). > >> >> > I printed out the timing and found the problem to be with a nested > for loop. > >> >> > > >> >> > I was hoping to reduce that process time by using a map that > automatically gets populated instead of having to create it on the fly. > >> >> > > >> >> > Before - subjects_collection "attribute_mapped_collection": > >> >> > > ******************************************************************************************** > > > >> >> > for row in students: > >> >> > row_no += 1 > >> >> > > >> >> > for subject in row.subjects: > >> >> > student_subjects[subject.code] = subject.value > >> >> > > >> >> > csv_row = [row_no] > >> >> > csv_row += [student_subjects.get(x.code, '') for x in > ordered_subjects] > >> >> > csv_row += [row.created_on, row.updated_on] > >> >> > > >> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x > for x in csv_row]) > >> >> > > >> >> > > >> >> > After adding the subjects_collection > "attribute_mapped_collection", I unfortunately did not see a change in > performance. > >> >> > > >> >> > After - subjects_collection "attribute_mapped_collection": > >> >> > > ******************************************************************************************** > > > >> >> > for row in students: > >> >> > row_no += 1 > >> >> > csv_row = [row_no] > >> >> > csv_row += [row.subjects_collection.get(x.code, '').value for x in > ordered_subjects] > >> >> > csv_row += [row.created_on, row.updated_on] > >> >> > > >> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x > for x in csv_row]) > >> >> > > >> >> > > >> >> > class Subject(db.Model): > >> >> > __tablename__ = 'subjects' > >> >> > > >> >> > student_id = db.Column(db.Integer, db.ForeignKey('students.id'), > primary_key=True) > >> >> > > >> >> > code = db.Column(db.String(50), primary_key=True) > >> >> > > >> >> > value= db.Column(db.String) > >> >> > > >> >> > def __init__(self, code , value): > >> >> > self.code = code > >> >> > self.value = value > >> >> > > >> >> > > >> >> > class Student(ResourceMixin, db.Model): > >> >> > __tablename__ = 'students' > >> >> > > >> >> > subjects= db.relationship('Subject', backref='student') > >> >> > > >> >> > id = db.Column(db.Integer, primary_key=True) > >> >> > > >> >> > subjects_collection = relationship("Subject", > collection_class=attribute_mapped_collection('code')) > >> >> > > >> >> > Can you see a way I can optimize this? Any ideas? > >> >> > > >> >> > > >> >> > On Friday, 3 July 2020 12:31:03 UTC+2, Simon King wrote: > >> >> >> > >> >> >> Are you trying to optimise the database access (ie. minimize the > >> >> >> number of queries), or provide a nice dictionary-style API for > your > >> >> >> Student objects? What do you mean when you say that looping over > >> >> >> student.subjects is quite heavy? > >> >> >> > >> >> >> An association proxy can be used to get dict-style access to a > relationship: > >> >> >> > >> >> >> > https://docs.sqlalchemy.org/en/13/orm/extensions/associationproxy.html#proxying-to-dictionary-based-collections > > >> >> >> > >> >> >> There are also a couple of examples in the SQLAlchemy docs that > >> >> >> provide a dictionary-style API: > >> >> >> > >> >> >> > https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.dynamic_dict > > >> >> >> > >> >> >> > https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.vertical > >> >> >> > >> >> >> Hope that helps, > >> >> >> > >> >> >> Simon > >> >> >> > >> >> >> On Thu, Jul 2, 2020 at 8:46 PM Justvuur <just...@gmail.com> > wrote: > >> >> >> > > >> >> >> > Hi there, > >> >> >> > > >> >> >> > I'm struggling to find an efficient way to get a two columned > subset into dictionary form. > >> >> >> > > >> >> >> > I have an entity that has a subset of data. The subset is > linked to the entity via Id. The order of the subset of data is defined in > another table. > >> >> >> > > >> >> >> > Example: > >> >> >> > Student - Id, firstname, lastname > >> >> >> > Subjects - StudentId, SubjectCode, SubjectName > >> >> >> > > >> >> >> > At the moment I'm looping through the SqlAlchemy result of > "student.subjects" in python and creating a dictionary from that. It's > quite heavy, especially when there are 2000+ students with a potential of > 100+ subjects each. > >> >> >> > > >> >> >> > For each student, how do I get the subjects as a dictionary for > a student where the key is the SubjectCode and the value is the > SubjectName? > >> >> >> > Better yet, how can I get a result set: Id, firstname, lastname > SubjectCode x, SubjectCode y, etc etc (where the SubjectName becomes the > value and the SubjectCode becomes the column)? > >> >> >> > > >> >> >> > Regards, > >> >> >> > Justin > >> >> >> > > >> >> >> > -- > >> >> >> > SQLAlchemy - > >> >> >> > The Python SQL Toolkit and Object Relational Mapper > >> >> >> > > >> >> >> > http://www.sqlalchemy.org/ > >> >> >> > > >> >> >> > To post example code, please provide an MCVE: Minimal, > Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve > for a full description. > >> >> >> > --- > >> >> >> > You received this message because you are subscribed to the > Google Groups "sqlalchemy" group. > >> >> >> > To unsubscribe from this group and stop receiving emails from > it, send an email to sqlal...@googlegroups.com. > >> >> >> > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/57b74c9a-a6e5-494b-b468-d0bdcbcce60co%40googlegroups.com. > > > >> >> > > >> >> > -- > >> >> > SQLAlchemy - > >> >> > The Python SQL Toolkit and Object Relational Mapper > >> >> > > >> >> > http://www.sqlalchemy.org/ > >> >> > > >> >> > To post example code, please provide an MCVE: Minimal, Complete, > and Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > >> >> > --- > >> >> > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > >> >> > To unsubscribe from this group and stop receiving emails from it, > send an email to sqlal...@googlegroups.com. > >> >> > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/67439e94-2666-402a-a642-7f4e7de8fbffo%40googlegroups.com. > > > >> > > >> > -- > >> > SQLAlchemy - > >> > The Python SQL Toolkit and Object Relational Mapper > >> > > >> > http://www.sqlalchemy.org/ > >> > > >> > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > >> > --- > >> > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > >> > To unsubscribe from this group and stop receiving emails from it, > send an email to sqlalchemy+...@googlegroups.com. > >> > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/9c86dfc4-3666-454e-8b5e-2b9e2a32166ao%40googlegroups.com. > > > > > > -- > > SQLAlchemy - > > The Python SQL Toolkit and Object Relational Mapper > > > > http://www.sqlalchemy.org/ > > > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > > --- > > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > > To unsubscribe from this group and stop receiving emails from it, send > an email to sqlal...@googlegroups.com <javascript:>. > > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/29b184cc-9f5c-4fe2-a69d-4fd12651c88cn%40googlegroups.com. > > > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/717c10af-95b3-4ec9-a139-e6f8dfd2923ao%40googlegroups.com.