Re: [sqlalchemy] Flask SQlAlchemy BaseQuery Paginate not working correctly

2020-08-17 Thread Prerna Pandit
Hey Simon,

I am indeed getting back duplicates of aggregate ids. I reworked my query 
to using exists like you suggested and now I get back correct result 
count.  Thanks so much for your suggestion and saving my day!  I can now 
move ahead on my work ticket.


Many, many, thanks again, appreciate it!!

Prerna

On Saturday, August 15, 2020 at 3:22:02 PM UTC-4, Simon King wrote:
>
> This is the same problem: you're writing a query that joins 3 tables 
> together, and then applying a "LIMIT 20" to that query. If you look 
> carefully at your 20 rows of psql output, I expect you'll see the same 
> aggregates_id appear more than once. There are less than 20 distinct 
> Aggregate objects. When SQLAlchemy receives those rows, it skips the 
> duplicates. 
>
> The SQL looks something like this: 
>
> SELECT ... FROM aggregates 
> LEFT JOIN aggregate_blocks ON ... 
> LEFT JOIN blocks ON ... 
> LIMIT 20 
>
> You need the LIMIT to apply just to the aggregates table, rather than 
> the joined tables. This is a bit tricky because you want to filter by 
> a field in one of those joined tables. You could use an EXISTS 
> function, something like this: 
>
> SELECT * FROM aggregates 
> WHERE EXISTS ( 
>   SELECT 1 
>   FROM aggregate_blocks 
>   INNER JOIN blocks ON aggregate_blocks.block_id = block.id 
>   WHERE aggregate_blocks.aggregate_id = aggregates.id 
>   AND blocks.is_complete = false 
> ) 
> LIMIT 20 
>
> If you run that in psql, I think you should get 20 *different* 
> aggregates rows back. 
>
> If that works, then to turn it into SQLAlchemy ORM syntax, you should 
> use Query.exists(): 
>
>
> https://docs.sqlalchemy.org/en/13/orm/query.html#sqlalchemy.orm.query.Query.exists
>  
>
> Hope that helps, 
>
> Simon 
>
> On Fri, Aug 14, 2020 at 3:56 PM Prerna Pandit  > wrote: 
> > 
> > Hey Simon, 
> > 
> > Thanks so much for replying to my question.  I reworked my code to use 
> sqlalchemy ORM and took off flask and paginate so I can narrow down the 
> issue. My models now extend from declarative_base. 
> > 
> > engine = 
> create_engine('postgresql://postgres:postgres@localhost:5400/postgres') 
> > Session = sessionmaker(bind=engine) 
> > 
> > Base = declarative_base() 
> > session = Session() 
> > 
> > Models 
> > 
> > class Aggregate(Base): 
> > __tablename__ = 'aggregates' 
> > id = Column(UUID(as_uuid=True), primary_key=True, 
> >server_default=text('uuid_generate_v4()')) 
> > site_id = Column(UUID(as_uuid=True), nullable=True) 
> > created_at = Column(DateTime, default=sa.func.now()) 
> > created_by = Column(UUID(as_uuid=True), nullable=True) 
> > updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now()) 
> > updated_by = Column(UUID(as_uuid=True), nullable=True) 
> > blocks = relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >   passive_deletes=True, 
> back_populates='aggregate') 
> > 
> > 
> > class Block(Base): 
> > 
> > __tablename__ = 'blocks' 
> > id = Column(UUID(as_uuid=True), primary_key=True, 
> >server_default=text('uuid_generate_v4()')) 
> > type = Column(Text, nullable=False) 
> > heading = Column(Text, nullable=True) 
> > subheading = Column(Text, nullable=True) 
> > label = Column(Text, nullable=True) 
> > is_complete = Column(Boolean, default=False) 
> > created_at = Column(DateTime, default=sa.func.now()) 
> > created_by = Column(UUID(as_uuid=True), nullable=True) 
> > updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now()) 
> > updated_by = Column(UUID(as_uuid=True), nullable=True) 
> > aggregates = relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >  passive_deletes=True, 
> back_populates='block') 
> > 
> > 
> > class AggregateBlock(Base): 
> > __tablename__ = 'aggregate_blocks' 
> > id = Column(UUID(as_uuid=True), primary_key=True, 
> >server_default=text('uuid_generate_v4()')) 
> > block_id = Column(UUID(as_uuid=True), 
> >  ForeignKey('blocks.id', ondelete='CASCADE'), 
> nullable=False, index=True) 
> > aggregate_id = Column(UUID(as_uuid=True), 
> >  ForeignKey('aggregates.id', 
> ondelete='RESTRICT'), nullable=False) 
> > position = Column(Integer, nullable=False) 
> > block = relationship('Block', back_populates='aggregates') 
> > aggregate = relationship('Aggregate', back_populates='blocks') 
> > 
> > 
> > Query: 
> > 
> > select = session.query(Aggregate).order_by(Aggregate.created_at) \ 
> > .join(AggregateBlock) \ 
> > .join(Block) \ 
> > .filter(Block.is_complete == complete) \ 
> > 
> > all_results = select.all() 
> > limit_results = select.limit(20).all() 
> > 
> > 
> > I still get inconsistent results when I apply limit. Like select.all() 
> will return 47 rows but  with limit it'll return 

Re: [sqlalchemy] Flask SQlAlchemy BaseQuery Paginate not working correctly

2020-08-15 Thread Simon King
This is the same problem: you're writing a query that joins 3 tables
together, and then applying a "LIMIT 20" to that query. If you look
carefully at your 20 rows of psql output, I expect you'll see the same
aggregates_id appear more than once. There are less than 20 distinct
Aggregate objects. When SQLAlchemy receives those rows, it skips the
duplicates.

The SQL looks something like this:

SELECT ... FROM aggregates
LEFT JOIN aggregate_blocks ON ...
LEFT JOIN blocks ON ...
LIMIT 20

You need the LIMIT to apply just to the aggregates table, rather than
the joined tables. This is a bit tricky because you want to filter by
a field in one of those joined tables. You could use an EXISTS
function, something like this:

SELECT * FROM aggregates
WHERE EXISTS (
  SELECT 1
  FROM aggregate_blocks
  INNER JOIN blocks ON aggregate_blocks.block_id = block.id
  WHERE aggregate_blocks.aggregate_id = aggregates.id
  AND blocks.is_complete = false
)
LIMIT 20

If you run that in psql, I think you should get 20 *different*
aggregates rows back.

If that works, then to turn it into SQLAlchemy ORM syntax, you should
use Query.exists():

https://docs.sqlalchemy.org/en/13/orm/query.html#sqlalchemy.orm.query.Query.exists

Hope that helps,

Simon

On Fri, Aug 14, 2020 at 3:56 PM Prerna Pandit  wrote:
>
> Hey Simon,
>
> Thanks so much for replying to my question.  I reworked my code to use 
> sqlalchemy ORM and took off flask and paginate so I can narrow down the 
> issue. My models now extend from declarative_base.
>
> engine = 
> create_engine('postgresql://postgres:postgres@localhost:5400/postgres')
> Session = sessionmaker(bind=engine)
>
> Base = declarative_base()
> session = Session()
>
> Models
>
> class Aggregate(Base):
> __tablename__ = 'aggregates'
> id = Column(UUID(as_uuid=True), primary_key=True,
>server_default=text('uuid_generate_v4()'))
> site_id = Column(UUID(as_uuid=True), nullable=True)
> created_at = Column(DateTime, default=sa.func.now())
> created_by = Column(UUID(as_uuid=True), nullable=True)
> updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now())
> updated_by = Column(UUID(as_uuid=True), nullable=True)
> blocks = relationship('AggregateBlock', cascade='all, delete-orphan',
>   passive_deletes=True, 
> back_populates='aggregate')
>
>
> class Block(Base):
>
> __tablename__ = 'blocks'
> id = Column(UUID(as_uuid=True), primary_key=True,
>server_default=text('uuid_generate_v4()'))
> type = Column(Text, nullable=False)
> heading = Column(Text, nullable=True)
> subheading = Column(Text, nullable=True)
> label = Column(Text, nullable=True)
> is_complete = Column(Boolean, default=False)
> created_at = Column(DateTime, default=sa.func.now())
> created_by = Column(UUID(as_uuid=True), nullable=True)
> updated_at = Column(DateTime, default=sa.func.now(), 
> onupdate=sa.func.now())
> updated_by = Column(UUID(as_uuid=True), nullable=True)
> aggregates = relationship('AggregateBlock', cascade='all, delete-orphan',
>  passive_deletes=True, back_populates='block')
>
>
> class AggregateBlock(Base):
> __tablename__ = 'aggregate_blocks'
> id = Column(UUID(as_uuid=True), primary_key=True,
>server_default=text('uuid_generate_v4()'))
> block_id = Column(UUID(as_uuid=True),
>  ForeignKey('blocks.id', ondelete='CASCADE'), 
> nullable=False, index=True)
> aggregate_id = Column(UUID(as_uuid=True),
>  ForeignKey('aggregates.id', 
> ondelete='RESTRICT'), nullable=False)
> position = Column(Integer, nullable=False)
> block = relationship('Block', back_populates='aggregates')
> aggregate = relationship('Aggregate', back_populates='blocks')
>
>
> Query:
>
> select = session.query(Aggregate).order_by(Aggregate.created_at) \
> .join(AggregateBlock) \
> .join(Block) \
> .filter(Block.is_complete == complete) \
>
> all_results = select.all()
> limit_results = select.limit(20).all()
>
>
> I still get inconsistent results when I apply limit. Like select.all() will 
> return 47 rows but  with limit it'll return anywhere between 11 to 15.  If I 
> take the generated SQL query and run it directly in psql, I get the correct 
> count.
>
> SELECT aggregates.id AS aggregates_id, aggregates.site_id AS 
> aggregates_site_id, aggregates.created_at AS aggregates_created_at, 
> aggregates.created_by AS aggregates_created_by, aggregates.updated_at AS 
> aggregates_updated_at, aggregates.updated_by AS aggregates_updated_by
> FROM aggregates JOIN aggregate_blocks ON aggregates.id = 
> aggregate_blocks.aggregate_id JOIN blocks ON blocks.id = 
> aggregate_blocks.block_id
> WHERE blocks.is_complete = false ORDER BY aggregates.created_at
>
>  LIMIT 20
>
>
>
>
>
> On Friday, August 14, 2020 at 10:08:23 AM UTC-4, Simon King wrote

Re: [sqlalchemy] Flask SQlAlchemy BaseQuery Paginate not working correctly

2020-08-14 Thread Prerna Pandit
Hey Simon,

Thanks so much for replying to my question.  I reworked my code to use 
sqlalchemy ORM and took off flask and paginate so I can narrow down the 
issue. My models now extend from declarative_base.

engine = 
create_engine('postgresql://postgres:postgres@localhost:5400/postgres')
Session = sessionmaker(bind=engine)

Base = declarative_base()
session = Session()

Models

class Aggregate(Base):
__tablename__ = 'aggregates'
id = Column(UUID(as_uuid=True), primary_key=True,
   server_default=text('uuid_generate_v4()'))
site_id = Column(UUID(as_uuid=True), nullable=True)
created_at = Column(DateTime, default=sa.func.now())
created_by = Column(UUID(as_uuid=True), nullable=True)
updated_at = Column(DateTime, default=sa.func.now(), 
onupdate=sa.func.now())
updated_by = Column(UUID(as_uuid=True), nullable=True)
blocks = relationship('AggregateBlock', cascade='all, delete-orphan',
  passive_deletes=True, 
back_populates='aggregate')
  

class Block(Base):

__tablename__ = 'blocks'
id = Column(UUID(as_uuid=True), primary_key=True,
   server_default=text('uuid_generate_v4()'))
type = Column(Text, nullable=False)
heading = Column(Text, nullable=True)
subheading = Column(Text, nullable=True)
label = Column(Text, nullable=True)
is_complete = Column(Boolean, default=False)
created_at = Column(DateTime, default=sa.func.now())
created_by = Column(UUID(as_uuid=True), nullable=True)
updated_at = Column(DateTime, default=sa.func.now(), 
onupdate=sa.func.now())
updated_by = Column(UUID(as_uuid=True), nullable=True)
aggregates = relationship('AggregateBlock', cascade='all, 
delete-orphan',
 passive_deletes=True, 
back_populates='block')


class AggregateBlock(Base):
__tablename__ = 'aggregate_blocks'
id = Column(UUID(as_uuid=True), primary_key=True,
   server_default=text('uuid_generate_v4()'))
block_id = Column(UUID(as_uuid=True),
 ForeignKey('blocks.id', ondelete='CASCADE'), 
nullable=False, index=True)
aggregate_id = Column(UUID(as_uuid=True),
 ForeignKey('aggregates.id', 
ondelete='RESTRICT'), nullable=False)
position = Column(Integer, nullable=False)
block = relationship('Block', back_populates='aggregates')
aggregate = relationship('Aggregate', back_populates='blocks')


Query:

select = session.query(Aggregate).order_by(Aggregate.created_at) \
.join(AggregateBlock) \
.join(Block) \
.filter(Block.is_complete == complete) \

all_results = select.all()
limit_results = select.limit(20).all()


I still get inconsistent results when I apply limit. Like select.all() will 
return 47 rows but  with limit it'll return anywhere between 11 to 15.  If 
I take the generated SQL query and run it directly in psql, I get the 
correct count.

SELECT aggregates.id AS aggregates_id, aggregates.site_id AS 
aggregates_site_id, aggregates.created_at AS aggregates_created_at, 
aggregates.created_by AS aggregates_created_by, aggregates.updated_at AS 
aggregates_updated_at, aggregates.updated_by AS aggregates_updated_by
FROM aggregates JOIN aggregate_blocks ON aggregates.id = 
aggregate_blocks.aggregate_id JOIN blocks ON blocks.id = 
aggregate_blocks.block_id
WHERE blocks.is_complete = false ORDER BY aggregates.created_at

 LIMIT 20





On Friday, August 14, 2020 at 10:08:23 AM UTC-4, Simon King wrote:
>
> "paginate" is not an SQLAlchemy function, so you'd be better off 
> asking the author of whatever is providing that feature. 
>
> However, I would guess that maybe paginate is naively applying 
> something like "LIMIT 20" to the query. This doesn't work properly 
> when you join along a one-to-many relationship, because if you have 
> (for example) 2 "parent" objects, each with 5 "child" objects, the 
> query will return 10 rows, but SQLAlchemy de-duplicates the results to 
> return just the 2 parent objects. 
>
> Simon 
>
> On Thu, Aug 13, 2020 at 3:31 PM Prerna Pandit  > wrote: 
> > 
> > Hello, I've been struggling with this issue for the past couple of days 
> and  would really, truly appreciate if someone could please give me 
> pointers or direction as to what I might be missing. 
> > 
> > 
> > Here are my models; 
> > class Aggregate(db.Model): 
> > id = db.Column(UUID(as_uuid=True), primary_key=True, 
> >server_default=db.text('uuid_generate_v4()')) 
> > blocks = db.relationship('AggregateBlock', cascade='all, 
> delete-orphan', 
> >  passive_deletes=True, 
> back_populates='aggregate').. 
> > 
> > 
> > class AggregateBlock(db.Model): 
> > 
> > id = db.Column(UUID(as_uuid=True), primary_key=True, 
> >server_default=db.text('uuid_generate_v4()')) 
> > block_id = db.Column(UUID(as_uuid=True), 
> > 

Re: [sqlalchemy] Flask SQlAlchemy BaseQuery Paginate not working correctly

2020-08-14 Thread Simon King
"paginate" is not an SQLAlchemy function, so you'd be better off
asking the author of whatever is providing that feature.

However, I would guess that maybe paginate is naively applying
something like "LIMIT 20" to the query. This doesn't work properly
when you join along a one-to-many relationship, because if you have
(for example) 2 "parent" objects, each with 5 "child" objects, the
query will return 10 rows, but SQLAlchemy de-duplicates the results to
return just the 2 parent objects.

Simon

On Thu, Aug 13, 2020 at 3:31 PM Prerna Pandit  wrote:
>
> Hello, I've been struggling with this issue for the past couple of days and  
> would really, truly appreciate if someone could please give me pointers or 
> direction as to what I might be missing.
>
>
> Here are my models;
> class Aggregate(db.Model):
> id = db.Column(UUID(as_uuid=True), primary_key=True,
>server_default=db.text('uuid_generate_v4()'))
> blocks = db.relationship('AggregateBlock', cascade='all, delete-orphan',
>  passive_deletes=True, 
> back_populates='aggregate')..
>
>
> class AggregateBlock(db.Model):
>
> id = db.Column(UUID(as_uuid=True), primary_key=True,
>server_default=db.text('uuid_generate_v4()'))
> block_id = db.Column(UUID(as_uuid=True),
>  db.ForeignKey('blocks.id', ondelete='CASCADE'), 
> nullable=False, index=True)
> aggregate_id = db.Column(UUID(as_uuid=True),
>  db.ForeignKey('aggregates.id', 
> ondelete='RESTRICT'), nullable=False)
> block = db.relationship('Block', back_populates='aggregates')
> aggregate = db.relationship('Aggregate', back_populates='blocks')
>
>
>
>
> class Block(db.Model):
> id = db.Column(UUID(as_uuid=True), primary_key=True,
>server_default=db.text('uuid_generate_v4()'))
> is_complete = db.Column(db.Boolean, default=False)
> aggregates = db.relationship('AggregateBlock', cascade='all, 
> delete-orphan',
>  passive_deletes=True, back_populates='block')
>
>
> from flask_sqlalchemy import SQLAlchemy
> db = SQLAlchemy()
>
> select = 
> db.session.query(Aggregate).join(AggregateBlock).join(Block).filter(Block.is_complete
>  == complete)
>
> print(len(select.all())
>
> print(len(select.paginate(per_page=20).items())
>
>
> If I do a select.all(), I get the right number of rows which is 47.  However, 
> if I try to paginate for a per_page size say 20, I lot a less rows like 11.
> select.paginate(per_page=20).
> The number could go up to 21 or so as I increase the page size.  Why would 
> paginate decrease the number of returned records?
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/e5a14305-4e63-4467-9610-1faf3f8c8412o%40googlegroups.com.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CAFHwexeg0K1URZZbi4oaQHtLFkCVbE3Fn2%2BY%3DDaaZymUp9S-qg%40mail.gmail.com.