Hey Simon,

I am indeed getting back duplicates of aggregate ids. I reworked my query 
to using exists like you suggested and now I get back correct result 
count.  Thanks so much for your suggestion and saving my day!  I can now 
move ahead on my work ticket.


Many, many, thanks again, appreciate it!!

Prerna

On Saturday, August 15, 2020 at 3:22:02 PM UTC-4, Simon King wrote:
>
> This is the same problem: you're writing a query that joins 3 tables 
> together, and then applying a "LIMIT 20" to that query. If you look 
> carefully at your 20 rows of psql output, I expect you'll see the same 
> aggregates_id appear more than once. There are less than 20 distinct 
> Aggregate objects. When SQLAlchemy receives those rows, it skips the 
> duplicates. 
>
> The SQL looks something like this: 
>
> SELECT ... FROM aggregates 
> LEFT JOIN aggregate_blocks ON ... 
> LEFT JOIN blocks ON ... 
> LIMIT 20 
>
> You need the LIMIT to apply just to the aggregates table, rather than 
> the joined tables. This is a bit tricky because you want to filter by 
> a field in one of those joined tables. You could use an EXISTS 
> function, something like this: 
>
> SELECT * FROM aggregates 
> WHERE EXISTS ( 
>   SELECT 1 
>   FROM aggregate_blocks 
>   INNER JOIN blocks ON aggregate_blocks.block_id = block.id 
>   WHERE aggregate_blocks.aggregate_id = aggregates.id 
>   AND blocks.is_complete = false 
> ) 
> LIMIT 20 
>
> If you run that in psql, I think you should get 20 *different* 
> aggregates rows back. 
>
> If that works, then to turn it into SQLAlchemy ORM syntax, you should 
> use Query.exists(): 
>
>
> https://docs.sqlalchemy.org/en/13/orm/query.html#sqlalchemy.orm.query.Query.exists
>  
>
> Hope that helps, 
>
> Simon 
>
> On Fri, Aug 14, 2020 at 3:56 PM Prerna Pandit <prerna...@gmail.com 
> <javascript:>> wrote: 
> > 
> > Hey Simon, 
> > 
> > Thanks so much for replying to my question.  I reworked my code to use 
> sqlalchemy ORM and took off flask and paginate so I can narrow down the 
> issue. My models now extend from declarative_base. 
> > 
> > engine = 
> create_engine('postgresql://postgres:postgres@localhost:5400/postgres') 
> > Session = sessionmaker(bind=engine) 
> > 
> > Base = declarative_base() 
> > session = Session() 
> > 
> > Models 
> > 
> > class Aggregate(Base): 
> >     __tablename__ = 'aggregates' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                    server_default=text('uuid_generate_v4()')) 
> >     site_id = Column(UUID(as_uuid=True), nullable=True) 
> >     created_at = Column(DateTime, default=sa.func.now()) 
> >     created_by = Column(UUID(as_uuid=True), nullable=True) 
> >     updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now()) 
> >     updated_by = Column(UUID(as_uuid=True), nullable=True) 
> >     blocks = relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >                               passive_deletes=True, 
> back_populates='aggregate') 
> > 
> > 
> > class Block(Base): 
> > 
> >     __tablename__ = 'blocks' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                    server_default=text('uuid_generate_v4()')) 
> >     type = Column(Text, nullable=False) 
> >     heading = Column(Text, nullable=True) 
> >     subheading = Column(Text, nullable=True) 
> >     label = Column(Text, nullable=True) 
> >     is_complete = Column(Boolean, default=False) 
> >     created_at = Column(DateTime, default=sa.func.now()) 
> >     created_by = Column(UUID(as_uuid=True), nullable=True) 
> >     updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now()) 
> >     updated_by = Column(UUID(as_uuid=True), nullable=True) 
> >     aggregates = relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >                                  passive_deletes=True, 
> back_populates='block') 
> > 
> > 
> > class AggregateBlock(Base): 
> >     __tablename__ = 'aggregate_blocks' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                    server_default=text('uuid_generate_v4()')) 
> >     block_id = Column(UUID(as_uuid=True), 
> >                          ForeignKey('blocks.id', ondelete='CASCADE'), 
> nullable=False, index=True) 
> >     aggregate_id = Column(UUID(as_uuid=True), 
> >                              ForeignKey('aggregates.id', 
> ondelete='RESTRICT'), nullable=False) 
> >     position = Column(Integer, nullable=False) 
> >     block = relationship('Block', back_populates='aggregates') 
> >     aggregate = relationship('Aggregate', back_populates='blocks') 
> > 
> > 
> > Query: 
> > 
> > select = session.query(Aggregate).order_by(Aggregate.created_at) \ 
> >             .join(AggregateBlock) \ 
> >             .join(Block) \ 
> >             .filter(Block.is_complete == complete) \ 
> > 
> > all_results = select.all() 
> > limit_results = select.limit(20).all() 
> > 
> > 
> > I still get inconsistent results when I apply limit. Like select.all() 
> will return 47 rows but  with limit it'll return anywhere between 11 to 15. 
>  If I take the generated SQL query and run it directly in psql, I get the 
> correct count. 
> > 
> > SELECT aggregates.id AS aggregates_id, aggregates.site_id AS 
> aggregates_site_id, aggregates.created_at AS aggregates_created_at, 
> aggregates.created_by AS aggregates_created_by, aggregates.updated_at AS 
> aggregates_updated_at, aggregates.updated_by AS aggregates_updated_by 
> > FROM aggregates JOIN aggregate_blocks ON aggregates.id = 
> aggregate_blocks.aggregate_id JOIN blocks ON blocks.id = 
> aggregate_blocks.block_id 
> > WHERE blocks.is_complete = false ORDER BY aggregates.created_at 
> > 
> >  LIMIT 20 
> > 
> > 
> > 
> > 
> > 
> > On Friday, August 14, 2020 at 10:08:23 AM UTC-4, Simon King wrote: 
> >> 
> >> "paginate" is not an SQLAlchemy function, so you'd be better off 
> >> asking the author of whatever is providing that feature. 
> >> 
> >> However, I would guess that maybe paginate is naively applying 
> >> something like "LIMIT 20" to the query. This doesn't work properly 
> >> when you join along a one-to-many relationship, because if you have 
> >> (for example) 2 "parent" objects, each with 5 "child" objects, the 
> >> query will return 10 rows, but SQLAlchemy de-duplicates the results to 
> >> return just the 2 parent objects. 
> >> 
> >> Simon 
> >> 
> >> On Thu, Aug 13, 2020 at 3:31 PM Prerna Pandit <prerna...@gmail.com> 
> wrote: 
> >> > 
> >> > Hello, I've been struggling with this issue for the past couple of 
> days and  would really, truly appreciate if someone could please give me 
> pointers or direction as to what I might be missing. 
> >> > 
> >> > 
> >> > Here are my models; 
> >> > class Aggregate(db.Model): 
> >> >     id = db.Column(UUID(as_uuid=True), primary_key=True, 
> >> >                    server_default=db.text('uuid_generate_v4()')) 
> >> >     blocks = db.relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >> >                              passive_deletes=True, 
> back_populates='aggregate').. 
> >> > 
> >> > 
> >> > class AggregateBlock(db.Model): 
> >> > 
> >> >     id = db.Column(UUID(as_uuid=True), primary_key=True, 
> >> >                    server_default=db.text('uuid_generate_v4()')) 
> >> >     block_id = db.Column(UUID(as_uuid=True), 
> >> >                          db.ForeignKey('blocks.id', 
> ondelete='CASCADE'), nullable=False, index=True) 
> >> >     aggregate_id = db.Column(UUID(as_uuid=True), 
> >> >                              db.ForeignKey('aggregates.id', 
> ondelete='RESTRICT'), nullable=False) 
> >> >     block = db.relationship('Block', back_populates='aggregates') 
> >> >     aggregate = db.relationship('Aggregate', back_populates='blocks') 
> >> > 
> >> > 
> >> > 
> >> > 
> >> > class Block(db.Model): 
> >> >     id = db.Column(UUID(as_uuid=True), primary_key=True, 
> >> >                    server_default=db.text('uuid_generate_v4()')) 
> >> >     is_complete = db.Column(db.Boolean, default=False) 
> >> >     aggregates = db.relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >> >                                  passive_deletes=True, 
> back_populates='block') 
> >> > 
> >> > 
> >> > from flask_sqlalchemy import SQLAlchemy 
> >> > db = SQLAlchemy() 
> >> > 
> >> > select = 
> db.session.query(Aggregate).join(AggregateBlock).join(Block).filter(Block.is_complete
>  
> == complete) 
> >> > 
> >> > print(len(select.all()) 
> >> > 
> >> > print(len(select.paginate(per_page=20).items()) 
> >> > 
> >> > 
> >> > If I do a select.all(), I get the right number of rows which is 47. 
>  However, if I try to paginate for a per_page size say 20, I lot a less 
> rows like 11. 
> >> > select.paginate(per_page=20). 
> >> > The number could go up to 21 or so as I increase the page size.  Why 
> would paginate decrease the number of returned records? 
> >> > 
> >> > -- 
> >> > SQLAlchemy - 
> >> > The Python SQL Toolkit and Object Relational Mapper 
> >> > 
> >> > http://www.sqlalchemy.org/ 
> >> > 
> >> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description. 
> >> > --- 
> >> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group. 
> >> > To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlal...@googlegroups.com. 
> >> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/e5a14305-4e63-4467-9610-1faf3f8c8412o%40googlegroups.com.
>  
>
> > 
> > -- 
> > SQLAlchemy - 
> > The Python SQL Toolkit and Object Relational Mapper 
> > 
> > http://www.sqlalchemy.org/ 
> > 
> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description. 
> > --- 
> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to sqlal...@googlegroups.com <javascript:>. 
> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/d643cddb-3c9d-4ac9-8388-6746f584c6e6o%40googlegroups.com.
>  
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/1133a9e0-d0f1-41ae-97e9-6d5c0e383c0do%40googlegroups.com.

Reply via email to