Thank you very much for your answers! Some more questions 1) I see you use sa.inspect(self).dict in __getstate__. Is it safe to replace it with self.__dict__? I didn't find any difference in the result, but __dict__ works 30 times faster. 2) I'm going to implement __getstate__ and __setstate__ in a Base class. Is it a risky idea from your point of view? 3) We've greatly optimized the unpickling for read-only entities. Is it possible to do the same during loading from the database? When I read data, I don't need ORM state, because I'm not going to modify the data and don't want lazy loads.
On Friday, 11 October 2019 00:34:03 UTC+7, Mike Bayer wrote: > > > > On Thu, Oct 10, 2019, at 2:48 AM, Александр Егоров wrote: > > Hello! > > I want to cache query results in a distributed cache (Redis), so I need to > serialize/deserialize fetched entities very fast. > However, it turned that SQLAlchemy entities are very heavy for pickle to > dump/load. Regular classes and dicts with the same structure takes > significantly less amount of time to reconstruct from bytes. > It's even slower than fetching and mapping the entities from a database. > > Here are benchmark results, loading 10000 simple entities: > > *class *Entity(Base): > __tablename__ = *'test_entity'* > id = sa.Column(sa.Integer, primary_key=*True*, autoincrement=*True*) > field1 = sa.Column(sa.String) > field2 = sa.Column(sa.Integer) > > > > Name (time in ms) Median StdDev > Dump bytes > -------------------------------------------------------------------------------------------------------- > test_unpickle_dicts 2.3764 (1.0) 0.0797 (1.09) > 337806 > test_unpickle_dataclasses 9.2546 (3.89) 0.0734 (1.0) > 437861 > test_unpickle_classes 9.4188 (3.96) 0.1230 (1.68) > 437862 > test_load_from_database 91.3482 (38.44) 0.6686 (9.10) > test_unpickle_unsaved_entities 108.6726 (45.73) 0.5154 (7.02) > 1448169 > test_unpickle_persisted_entities 166.4531 (70.04) 0.3787 (5.16) > 1697849 > --------------------------------------------------------------------------------------------------------- > > > *Environment*: > OS: Ubuntu 16.04 > Python: CPython 3.6.3 > SQLAlchemy: 1.3.0 > Database: in-memory SQLite, PostgreSQL (all gives similar results) > The full benchmark code is in a single attachment file. > To run: > pytest test_ser.py --benchmark-columns=median,stddev --benchmark-disable- > gc > Requirements: > pip install sqlalchemy==1.3.0 pytest==3.6.0 pytest-benchmark==3.1.1 > dataclasses==0.6 > > *Questions:* > 1) What strategy for caching I should consider while using SQLAlchemy? > Currently, the only option I see is to have a duplicated declaration of > entities in a form of simple classes and use it when I don't need > modification. Needles to say, it's a lot of code duplication. > 2) Is it possible to have a read-only loader in SQLAlchemy? I can imagine > mapping loaded data to an entity class but without instrumentation overhead > which is not needed to read the data. > If continue the idea further, I can imagine a situation when I load the > entity for modification, and it has a read-only relationship to another > entity. So that I don't pay for things I don't use. > > Probably I'm making some crazy things and all wrong, feel free to point me > out. My ultimate goal is to manage caching with SQLAlchemy and understand > SQLAlchemy more in this direction. > Thank you in advance! > > > Hi there - > > there's no simple answer for this, have you considered writing your own > serializer for your objects rather than using pickle's default strategy? > that is, if you implement __getstate__ and __setstate__, you can allow the > serialization to not include the "_sa_instance_state" that's associated > with the objects which is where you'd get some of the slowness on dumping / > loading, it also would save lots of memory / network overhead by not > storing lots of data that you can probably forego for storing in your cache. > > For bringing the classes back, you can implement a setstate that returns > them as non-ORM classes, or patches them in as ORM classes. > > I've altered your test to include only ORM mapped classes, comparing your > baseline to one that uses __getstate__, and then several variants which > will grant more or less ORM functionality once you have unpickled them. > the very fastest one which omits any ORM functionality at all, and looks > like what you propose, has a mean of 26 ms compared to 380 ms for the most > expensive. > > there's one ORM-level guarantee you would want me to add for this most > optimized version, that is if you have an ORM object that has no > _sa_instance_state at all, you can still access attributes on it. This > works but I would want to add test support so that I keep this as a > supported use case. > > > > > > > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlal...@googlegroups.com <javascript:>. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/75aa92b7-3c94-4b5a-afed-062a203792fd%40googlegroups.com > > <https://groups.google.com/d/msgid/sqlalchemy/75aa92b7-3c94-4b5a-afed-062a203792fd%40googlegroups.com?utm_medium=email&utm_source=footer> > . > > > *Attachments:* > > - test_sqla_pickle.py > > > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/e6b0c770-21c4-4672-b886-4a75e97f94f2%40googlegroups.com.