Re: [sqlalchemy] convert subset to dictionary

Justvuur Tue, 07 Jul 2020 04:53:38 -0700

I'm currently testing with 7000 students with 181 subjects.
I first went over to the DB to run the query directly on there. I did an 
innerjoin between the two tables.
It took about 2:25 minutes. I then selected specific columns from the two 
tables and the time dropped to about 2 minutes or so.
I then tried a "select subjects where student ID in (...)" query and that 
executed a lot quicker than the others.


So then on the python side of things, I first fetched the students and only 
selected id, created_on and updated_on columns using the with_entities 
option.
That seemed to take less than 1 second. I then, for each student, executed 
a DB query to fetch the subjects using the with_entities option.
For some reason, this seems to work faster than using the  
eager-loading/students.subjects/attribute_mapped_collection. They seem to 
take 2 minutes longer than the above (what should be inefficient) approach.

On Tuesday, 7 July 2020 at 12:50:13 UTC+2 Simon King wrote:

> How long is it taking? You mentioned 2000 students with 100 subjects
> each, so there are something like 200,000 rows in the Subjects table,
> and you need to load all of it. I wouldn't expect that to take longer
> than a couple of seconds though.
>
> Simon
>
> On Mon, Jul 6, 2020 at 7:34 PM Justvuur <just...@gmail.com> wrote:
> >
> > I added the eager-loading but it seems to slow down the SQL query quite 
> a lot.
> > It's as if now, the SQL query is taking longer but the generating of the 
> file is quicker hehe... I guess now the queries are being fired before to 
> populate subjects.
> > It's still taking relatively the same amount of time though.
> >
> >
> > On Friday, 3 July 2020 17:07:43 UTC+2, Simon King wrote:
> >>
> >> Are you eager-loading the "student.subjects" relationship? If not,
> >> that will give you the biggest performance increase. Without that, you
> >> will be issuing a separate DB query for each of the students, to load
> >> that student's subjects. Eager-loading allows you to preload the
> >> subjects for every student in a single query:
> >>
> >> 
> https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#joined-eager-loading
> >>
> >> Simon
> >>
> >> On Fri, Jul 3, 2020 at 1:36 PM Justvuur <just...@gmail.com> wrote:
> >> >
> >> > Hi Simon, thanks for the help! I've never used that before, it's 
> quite handy.
> >> >
> >> > I'm looping through all the students and printing them and their 
> subject details to a CSV file.
> >> > What makes things a tad complicated is the subjects must appear in a 
> specific order.
> >> > There is a table that has the subject code and order number 
> (ordered_subjects used below is the resultset from it).
> >> > I printed out the timing and found the problem to be with a nested 
> for loop.
> >> >
> >> > I was hoping to reduce that process time by using a map that 
> automatically gets populated instead of having to create it on the fly.
> >> >
> >> > Before - subjects_collection "attribute_mapped_collection":
> >> > 
> ********************************************************************************************
> >> > for row in students:
> >> > row_no += 1
> >> >
> >> > for subject in row.subjects:
> >> > student_subjects[subject.code] = subject.value
> >> >
> >> > csv_row = [row_no]
> >> > csv_row += [student_subjects.get(x.code, '') for x in 
> ordered_subjects]
> >> > csv_row += [row.created_on, row.updated_on]
> >> >
> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x for x 
> in csv_row])
> >> >
> >> >
> >> > After adding the subjects_collection "attribute_mapped_collection", I 
> unfortunately did not see a change in performance.
> >> >
> >> > After - subjects_collection "attribute_mapped_collection":
> >> > 
> ********************************************************************************************
> >> > for row in students:
> >> > row_no += 1
> >> > csv_row = [row_no]
> >> > csv_row += [row.subjects_collection.get(x.code, '').value for x in 
> ordered_subjects]
> >> > csv_row += [row.created_on, row.updated_on]
> >> >
> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x for x 
> in csv_row])
> >> >
> >> >
> >> > class Subject(db.Model):
> >> > __tablename__ = 'subjects'
> >> >
> >> > student_id = db.Column(db.Integer, db.ForeignKey('students.id'), 
> primary_key=True)
> >> >
> >> > code = db.Column(db.String(50), primary_key=True)
> >> >
> >> > value= db.Column(db.String)
> >> >
> >> > def __init__(self, code , value):
> >> > self.code = code
> >> > self.value = value
> >> >
> >> >
> >> > class Student(ResourceMixin, db.Model):
> >> > __tablename__ = 'students'
> >> >
> >> > subjects= db.relationship('Subject', backref='student')
> >> >
> >> > id = db.Column(db.Integer, primary_key=True)
> >> >
> >> > subjects_collection = relationship("Subject", 
> collection_class=attribute_mapped_collection('code'))
> >> >
> >> > Can you see a way I can optimize this? Any ideas?
> >> >
> >> >
> >> > On Friday, 3 July 2020 12:31:03 UTC+2, Simon King wrote:
> >> >>
> >> >> Are you trying to optimise the database access (ie. minimize the
> >> >> number of queries), or provide a nice dictionary-style API for your
> >> >> Student objects? What do you mean when you say that looping over
> >> >> student.subjects is quite heavy?
> >> >>
> >> >> An association proxy can be used to get dict-style access to a 
> relationship:
> >> >>
> >> >> 
> https://docs.sqlalchemy.org/en/13/orm/extensions/associationproxy.html#proxying-to-dictionary-based-collections
> >> >>
> >> >> There are also a couple of examples in the SQLAlchemy docs that
> >> >> provide a dictionary-style API:
> >> >>
> >> >> 
> https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.dynamic_dict
> >> >>
> >> >> 
> https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.vertical
> >> >>
> >> >> Hope that helps,
> >> >>
> >> >> Simon
> >> >>
> >> >> On Thu, Jul 2, 2020 at 8:46 PM Justvuur <just...@gmail.com> wrote:
> >> >> >
> >> >> > Hi there,
> >> >> >
> >> >> > I'm struggling to find an efficient way to get a two columned 
> subset into dictionary form.
> >> >> >
> >> >> > I have an entity that has a subset of data. The subset is linked 
> to the entity via Id. The order of the subset of data is defined in another 
> table.
> >> >> >
> >> >> > Example:
> >> >> > Student - Id, firstname, lastname
> >> >> > Subjects - StudentId, SubjectCode, SubjectName
> >> >> >
> >> >> > At the moment I'm looping through the SqlAlchemy result of 
> "student.subjects" in python and creating a dictionary from that. It's 
> quite heavy, especially when there are 2000+ students with a potential of 
> 100+ subjects each.
> >> >> >
> >> >> > For each student, how do I get the subjects as a dictionary for a 
> student where the key is the SubjectCode and the value is the SubjectName?
> >> >> > Better yet, how can I get a result set: Id, firstname, lastname 
> SubjectCode x, SubjectCode y, etc etc (where the SubjectName becomes the 
> value and the SubjectCode becomes the column)?
> >> >> >
> >> >> > Regards,
> >> >> > Justin
> >> >> >
> >> >> > --
> >> >> > SQLAlchemy -
> >> >> > The Python SQL Toolkit and Object Relational Mapper
> >> >> >
> >> >> > http://www.sqlalchemy.org/
> >> >> >
> >> >> > To post example code, please provide an MCVE: Minimal, Complete, 
> and Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> >> > ---
> >> >> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> >> > To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlal...@googlegroups.com.
> >> >> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/57b74c9a-a6e5-494b-b468-d0bdcbcce60co%40googlegroups.com
> .
> >> >
> >> > --
> >> > SQLAlchemy -
> >> > The Python SQL Toolkit and Object Relational Mapper
> >> >
> >> > http://www.sqlalchemy.org/
> >> >
> >> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> > ---
> >> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> > To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlal...@googlegroups.com.
> >> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/67439e94-2666-402a-a642-7f4e7de8fbffo%40googlegroups.com
> .
> >
> > --
> > SQLAlchemy -
> > The Python SQL Toolkit and Object Relational Mapper
> >
> > http://www.sqlalchemy.org/
> >
> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> > ---
> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to sqlalchemy+...@googlegroups.com.
> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/9c86dfc4-3666-454e-8b5e-2b9e2a32166ao%40googlegroups.com
> .
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/29b184cc-9f5c-4fe2-a69d-4fd12651c88cn%40googlegroups.com.

Re: [sqlalchemy] convert subset to dictionary

Reply via email to