Re: Delay between consecutive SQL statements when using Alembic on Jenkins

2020-08-17 Thread Mike Bayer


On Sun, Aug 16, 2020, at 11:10 PM, Michał Guzek wrote:
> I have a problem with delays between consecutive SQL statements when Alembic 
> executes a migration script's upgrade() function on Jenkins:

> def upgrade():
>     op.execute("DELETE FROM employee WHERE name='John';")  # John's primary key ID is also 1
>     op.execute("INSERT INTO employee (id, name, occupation) VALUES (1, 'Michael', 'Bartender');")
> Basically the second statement can't run to completion because I'm getting 
> an error that the ID column would contain duplicate values. Apparently, there 
> is some sort of delay between the first and second op.execute() statement, and 
> when the second one is executed, the DB still contains the old entry for John, 
> whose ID is also 1.


there's no inherent "delay" between statements.   this would sound like you 
have something going on in the background, such as special triggers and/or 
replication that has some kind of lag.   What else can you learn from looking 
at the (not shown here) error message you're getting?  is the table name what 
you expect?  does the stack trace make sense?   
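
one way to get more data is to run a verification SELECT in between the two 
statements, on the migration's own connection, and log what it reports.  a 
minimal diagnostic sketch (the SELECT and the print are illustrative 
additions, not part of the original migration):

    import sqlalchemy as sa
    from alembic import op

    def upgrade():
        conn = op.get_bind()  # the connection the migration is running on
        conn.execute(sa.text("DELETE FROM employee WHERE name='John';"))

        # if this prints anything other than 0, the DELETE isn't visible on
        # the very same connection -- strong evidence of replication or
        # proxy lag behind the scenes
        remaining = conn.execute(
            sa.text("SELECT COUNT(*) FROM employee WHERE id = 1")
        ).scalar()
        print("rows with id=1 after DELETE:", remaining)

        conn.execute(sa.text(
            "INSERT INTO employee (id, name, occupation) "
            "VALUES (1, 'Michael', 'Bartender');"
        ))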

> I use SQLAlchemy's create_engine() to initiate a DB connection to MySQL in 
> Alembic's run_migrations_online() function in its env.py configuration file. I 
> don't experience any such delays or errors on my local machine, where I use 
> the exact same codebase and MySQL as well. Also, the problem on Jenkins is 
> intermittent; sometimes the build succeeds when I just hit rebuild.


What else can you say about the MySQL database in use?   what's the exact 
version in use?  mariadb or mysql?   is it galera?  is there a proxy server or 
scaling server in the middle?  what storage engine is used for these tables?  
(innodb?)  does the issue happen only when there's lots of load on the machine? 
  was there some configuration change that precipitated this failure beginning 
to happen?

is this jenkins running on a VM or container, does it get built out to 
arbitrary servers? is it running anywhere else without a problem?  

> Do you know what might be the cause of those weird delays on Jenkins?

> Previously I put the DELETE statement in the downgrade() function and ran it 
> first by using alembic downgrade -1 and subsequently alembic upgrade head. 
> But I had the same issue; it's as if alembic downgrade updated the 
> alembic_version table correctly on time but wasn't able to update my employee 
> table on time, before upgrade() started... And it also happened only on 
> Jenkins.




[sqlalchemy] Re: compatibility between SQLite and PostgreSQL

2020-08-17 Thread Jonathan Vanasco


On Friday, August 14, 2020 at 10:00:46 AM UTC-4, Alceu Rodrigues de Freitas 
Junior wrote:
 

> I have some projects that are using SQLAlchemy and PostgreSQL, and they run 
> fine until we have to run automated tests that use PostgreSQL; those take 
> ~5 minutes to complete, even after trying to improve the PostgreSQL 
> configuration for better performance.
>
> Those tests are meant to be executed locally, so SQLite (in memory) would 
> be a much faster option.
>

I usually target PostgreSQL for production and continuous integration, and 
SQLite for developer testing and unit tests.  There are some tradeoffs in 
leveraging SQLite in-memory vs file-based regarding locking, concurrency, 
and per-test setups.  In my experience, a file-backed SQLite is better for 
higher-level tests, and memory-backed is otherwise fine.
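
A minimal sketch of the two variants (the file name is illustrative):

    from sqlalchemy import create_engine

    # in-memory: fastest, but the database disappears with the connection,
    # which suits isolated unit tests
    mem_engine = create_engine("sqlite://")

    # file-backed: visible across connections, so it behaves more like a
    # "real" database for higher-level tests
    file_engine = create_engine("sqlite:///./test_suite.db")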


> The problem is, creating a model for the SQLite dialect results in 
> something incompatible with PostgreSQL (I'm quite aware that their inner 
> mechanisms are very different).
>
> Is it possible to generate models that could be compatible between the two 
> database systems, so I could switch from one to another depending on the 
> environment? Retaining data is not something required, since all data 
> created by the tests will be disposable anyway.
>

Exactly what issues are you having, and why do you need to create a 
different model?  You should be able to use the same model for both.

When it comes to managing core differences between the databases -- like the 
very different database functions -- I use the @compiles decorator to 
write custom constructs that compile differently on each backend.
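
A minimal sketch of that pattern, along the lines of the utcnow() example in 
the SQLAlchemy compiler-extension docs (the per-backend renderings shown are 
the usual ones, but treat them as illustrative):

    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql import expression
    from sqlalchemy.types import DateTime

    class utcnow(expression.FunctionElement):
        """A cross-backend 'current UTC timestamp' construct."""
        type = DateTime()

    @compiles(utcnow, "postgresql")
    def pg_utcnow(element, compiler, **kw):
        return "TIMEZONE('utc', CURRENT_TIMESTAMP)"

    @compiles(utcnow, "sqlite")
    def sqlite_utcnow(element, compiler, **kw):
        # SQLite's CURRENT_TIMESTAMP is already UTC
        return "CURRENT_TIMESTAMP"

    # the same model column then works on both backends, e.g.:
    # Column(DateTime, server_default=utcnow())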

What you're talking about is a very common usage of SQLAlchemy. Please 
share more details so we can help you get up and running.



[sqlalchemy] Joining and labeling data on multiple tables

2020-08-17 Thread alireza...@gmail.com

Hello,
I have 3 main tables and 3 association tables between them
Main tables: Sites, Categories, Products
Association Tables: Sites<->Categories, Categories<->Products, 
Sites<->ChosenProducts

Here is a link to the diagram:
https://dbdiagram.io/d/5f3a863d39081a20ae94d500

Here is my code:
https://gist.github.com/DarkSuniuM/df1cfaeaf7a0745326653078febedc10

I need to join Site to Category and then Category to Product to get all 
products associated with a Site object.
It's doable using the following statement:
`session.query(Product).join(Categories.products).join(Categories.sites).filter(Site.id == 1).all()`

I'm also able to get the chosen products by just getting the site and then 
using its `chosen_products` attribute.

But I need to label, or even create a temporary object for, each result.
I need to be able to get the difference between the available products (by 
categories, for a site) and the chosen products for a site.
I need a runtime(?) attribute on each result, like "chosen", holding a 
boolean value based on its existence in the website's chosen_products.
I need to be able to list the keywords for each product instead of getting 
multiple objects of the same product because of multiple categories.

I don't even know if the thing I wanna do is the right way to do it...
So,
1. How can I join all these tables together and be able to get the 
difference, the intersection, and the full result?
2. How can I label them based on their existence in the ChosenProducts of a 
website? (A rough sketch of what I imagine is below.)
3. How can I have an intersection of keywords between the results (products) 
and the site? (I guess this one should be done using AssociationProxy, but 
I'm not sure how.)
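
For question 2, this is roughly the shape I imagine (a sketch only -- it 
assumes a Core table object `chosen_products` with `site_id` and 
`product_id` columns, which may not match my actual models):

    from sqlalchemy import and_, exists

    # boolean column: does this product appear in the site's chosen products?
    chosen_flag = exists().where(
        and_(
            chosen_products.c.site_id == Site.id,
            chosen_products.c.product_id == Product.id,
        )
    ).label("chosen")

    rows = (
        session.query(Product, chosen_flag)
        .join(Categories.products)
        .join(Categories.sites)
        .filter(Site.id == 1)
        .all()
    )
    # each row is a (Product, bool) tuple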

Thanks in advance...




Re: [sqlalchemy] Flask SQlAlchemy BaseQuery Paginate not working correctly

2020-08-17 Thread Prerna Pandit
Hey Simon,

I am indeed getting back duplicates of aggregate ids. I reworked my query 
to use exists like you suggested, and now I get back the correct result 
count.  Thanks so much for your suggestion and saving my day!  I can now 
move ahead on my work ticket.
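
In case it helps anyone else, the reworked query looks roughly like this (a 
sketch using the model names quoted below; Query.exists() turns the inner 
query into the correlated EXISTS subquery Simon described):

    has_incomplete = (
        session.query(AggregateBlock)
        .join(Block)
        .filter(
            AggregateBlock.aggregate_id == Aggregate.id,
            Block.is_complete == False,
        )
        .exists()
    )

    select = session.query(Aggregate).filter(has_incomplete) \
        .order_by(Aggregate.created_at)
    limit_results = select.limit(20).all()  # 20 distinct Aggregate rows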


Many, many thanks again; appreciate it!!

Prerna

On Saturday, August 15, 2020 at 3:22:02 PM UTC-4, Simon King wrote:
>
> This is the same problem: you're writing a query that joins 3 tables 
> together and then applying a "LIMIT 20" to that query. If you look 
> carefully at your 20 rows of psql output, I expect you'll see the same 
> aggregate_id appear more than once, so there are fewer than 20 distinct 
> Aggregate objects among them. When SQLAlchemy receives those rows, it 
> skips the duplicates. 
>
> The SQL looks something like this: 
>
> SELECT ... FROM aggregates 
> LEFT JOIN aggregate_blocks ON ... 
> LEFT JOIN blocks ON ... 
> LIMIT 20 
>
> You need the LIMIT to apply just to the aggregates table, rather than 
> the joined tables. This is a bit tricky because you want to filter by 
> a field in one of those joined tables. You could use an EXISTS 
> function, something like this: 
>
> SELECT * FROM aggregates 
> WHERE EXISTS ( 
>   SELECT 1 
>   FROM aggregate_blocks 
>   INNER JOIN blocks ON aggregate_blocks.block_id = blocks.id 
>   WHERE aggregate_blocks.aggregate_id = aggregates.id 
>   AND blocks.is_complete = false 
> ) 
> LIMIT 20 
>
> If you run that in psql, I think you should get 20 *different* 
> aggregates rows back. 
>
> If that works, then to turn it into SQLAlchemy ORM syntax, you should 
> use Query.exists(): 
>
>
> https://docs.sqlalchemy.org/en/13/orm/query.html#sqlalchemy.orm.query.Query.exists
>  
>
> Hope that helps, 
>
> Simon 
>
> On Fri, Aug 14, 2020 at 3:56 PM Prerna Pandit wrote: 
> > 
> > Hey Simon, 
> > 
> > Thanks so much for replying to my question.  I reworked my code to use 
> > SQLAlchemy ORM and took off Flask and paginate so I can narrow down the 
> > issue. My models now extend from declarative_base. 
> > 
> > engine = create_engine('postgresql://postgres:postgres@localhost:5400/postgres') 
> > Session = sessionmaker(bind=engine) 
> > 
> > Base = declarative_base() 
> > session = Session() 
> > 
> > Models: 
> > 
> > class Aggregate(Base): 
> >     __tablename__ = 'aggregates' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                 server_default=text('uuid_generate_v4()')) 
> >     site_id = Column(UUID(as_uuid=True), nullable=True) 
> >     created_at = Column(DateTime, default=sa.func.now()) 
> >     created_by = Column(UUID(as_uuid=True), nullable=True) 
> >     updated_at = Column(DateTime, default=sa.func.now(), onupdate=sa.func.now()) 
> >     updated_by = Column(UUID(as_uuid=True), nullable=True) 
> >     blocks = relationship('AggregateBlock', cascade='all, delete-orphan', 
> >                           passive_deletes=True, back_populates='aggregate') 
> > 
> > 
> > class Block(Base): 
> >     __tablename__ = 'blocks' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                 server_default=text('uuid_generate_v4()')) 
> >     type = Column(Text, nullable=False) 
> >     heading = Column(Text, nullable=True) 
> >     subheading = Column(Text, nullable=True) 
> >     label = Column(Text, nullable=True) 
> >     is_complete = Column(Boolean, default=False) 
> >     created_at = Column(DateTime, default=sa.func.now()) 
> >     created_by = Column(UUID(as_uuid=True), nullable=True) 
> >     updated_at = Column(DateTime, default=sa.func.now(), onupdate=sa.func.now()) 
> >     updated_by = Column(UUID(as_uuid=True), nullable=True) 
> >     aggregates = relationship('AggregateBlock', cascade='all, delete-orphan', 
> >                               passive_deletes=True, back_populates='block') 
> > 
> > 
> > class AggregateBlock(Base): 
> >     __tablename__ = 'aggregate_blocks' 
> >     id = Column(UUID(as_uuid=True), primary_key=True, 
> >                 server_default=text('uuid_generate_v4()')) 
> >     block_id = Column(UUID(as_uuid=True), 
> >                       ForeignKey('blocks.id', ondelete='CASCADE'), 
> >                       nullable=False, index=True) 
> >     aggregate_id = Column(UUID(as_uuid=True), 
> >                           ForeignKey('aggregates.id', ondelete='RESTRICT'), 
> >                           nullable=False) 
> >     position = Column(Integer, nullable=False) 
> >     block = relationship('Block', back_populates='aggregates') 
> >     aggregate = relationship('Aggregate', back_populates='blocks') 
> > 
> > 
> > Query: 
> > 
> > select = session.query(Aggregate).order_by(Aggregate.created_at) \ 
> >     .join(AggregateBlock) \ 
> >     .join(Block) \ 
> >     .filter(Block.is_complete == complete) 
> > 
> > all_results = select.all() 
> > limit_results = select.limit(20).all() 
> > 
> > 
> > I still get inconsistent results when I apply limit. Like select.all() 
> > will return 47 rows but with limit it'll return