Re: [sqlalchemy] How to SELECT computed data using SQLAlchemy?

'Daniel Grindrod' via sqlalchemy Tue, 14 Dec 2021 09:26:24 -0800

Hi Michael,

Thanks for such a quick reply.
I enjoyed reading it! I actually inherited this API (I swear I'm not just 
making excuses!) from a colleague who left a few months earlier, so it's 
very much been a case of 'Figuring it out as I go along'.


Apologies for the incomplete code - despite it not being particularly 
exciting code, I wanted to double check that I'm allowed to post it 
publicly. 
So the original (complete) code for this function is as follows:

def similar_structure_matches(smiles, similarity_threshold): 

  struc_sim_query = db.select([structures_tbl, text(":q_smiles as 
query_smiles, jc_tanimoto(canonical_smiles, :q_smiles) as 
similarity").bindparams(q_smiles=smiles)]). \      
where(text("jc_tanimoto(canonical_smiles, :q_smiles) >= :q_sim"). 
bindparams(q_smiles=smiles, q_sim=similarity_threshold)) 

  struc_sim_res = struc_sim_query.execute().fetchall()

  if len(struc_sim_res) ==   0: 
    return '', 204 

returnMatchLimaSchema(many=True).dump(struc_sim_res) 

The above code is used to generate tanimoto (similarity) scores for the 
queried structure against each structure in the database( SMILES describe 
chemical structures).
As I understand it, the jc_tanimoto function comes from the Chemaxon 
Cartridge which we have installed on our Oracle server (Cartridge API | 
ChemAxon Docs 
<https://docs.chemaxon.com/display/docs/cartridge-api.md#src-1803469-cartridgeapi-jc-tanimoto>
).

I'm not entirely sure how to call this function, without it being wrapped 
by text().
As I understand it, the code you sent across would be applying the 
comparison (now jc_tanimoto) function in the Python; not within Oracle 
itself (of course, that was impossible for you to predict with the 
incomplete code I sent across).

Could you please advise on how to correctly structure this query?

Thanks again,
Dan

On Tuesday, 14 December 2021 at 13:31:12 UTC Mike Bayer wrote:

>
>
> On Tue, Dec 14, 2021, at 5:40 AM, 'Daniel Grindrod' via sqlalchemy wrote:
>
> Hi all,
>
> I'm working on a REST API which is built using Flask-SQLAlchemy and 
> Connexion. I'm fairly new to SQLAlchemy, but it's been brilliant so far 
> :) This API uses SQLAlchemy 1.3.16, and connects to an Oracle Database (12c 
> 12.1.0.1.0 64bit). 
> <https://stackoverflow.com/posts/70341129/timeline>
>
> I'm having an issue generating the correct SQL from a SQLAlchemy query. I 
> would really appreciate any help. The troublesome function is shown below.
> def similar_matches(input_descriptor, threshold, lim=None, offset): 
>
>   query = db.select([tbl, text(":q_descriptors as query_descriptors, 
> comparison(descriptors, :q_descriptors) as 
> similarity")bindparams(q_descriptor=input_descriptor).\
>   where( text("comparison(descriptors, :q_descriptors) >=  
> q_threshold").bindparams(q_descriptor=input_descriptor, q_threshold = 
> threshold) 
>
>
> heya -
>
> it's early here but I almost want to be able to tell a story about that 
> pattern above, which has select(text("cola, colb, colc, ...))  in it.   
> It's kind of an "anti-unicorn" for me, as I've done many refactorings to 
> the result-processing part of SQLAlchemy's engine and each time I do so, 
> there's some internal handwringing over, "what if someone is SELECTING from 
> a text() that has multiple columns comma-separated in them?", which 
> specifically is a problem because it means we can't positionally link the 
> columns we get back from the cursor to the Python expressions that are in 
> the select() object, and each time it's like, "nah, nobody would do that", 
> or, "nah, nobody *should* do that", but yet, as there's not a 
> straightforward way to detect/warn for that, there's a whole set of code / 
> commentary at 
> https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/engine/cursor.py#L325
>  
> which wonders if we'd ever see this.   
>
> and here it is!  :)   the dark unicorn.    So, it's also the source of the 
> issue here, because the Oracle dialect has to restructure the query to 
> simulate limit/offset.   Soooo.... back into the barn w/ the unicorn and 
> what we do here is make sure the select() has enough structure so that 
> SQLAlchemy knows what's going on and here that would look like (note I'm 
> making some syntactical assumptions about the code above which seems to be 
> incomplete ):
>
> from sqlalchemy import literal, func
>
> query = db.select(
>     [
>         tbl,
>         literal(input_descriptor).label("query_descriptors"),
>         func.comparison(tbl.c.descriptors, 
> q_descriptors).label("similarity")
>     ]).
>   where(
>     func.comparison(tbl.c.descriptors, q_descriptors) >= threshold
>
>   )
>
> that way your select() will have .selected_columns entries for every 
> column in "tbl" plus columns "query_descriptors" and "similarity", and 
> these will export on outwards to the subquery that the Oracle dialect 
> creates.
>
>
>
>
>
>
>
>
>
>   res = query.execute().fetchall() 
>
>   if len(res)=0 return '', 204 
>
>   return MatchLimaScheme(many = True).dump(res) 
>
> This SQLAlchemy code takes two inputs (descriptor and threshold), and 
> searches through each descriptor in an Oracle database, calculating a 
> similarity measure  between the queried descriptor and each stored 
> descriptor. All rows where similarity score >= threshold are returned in a 
> JSON.
>
> The above code works fine, but returns all results - whereas I want to 
> also be able to include a .offset() and a .limit() (for lazy loading). The 
> code above generates SQL along these lines:
> SELECT ID, last_modified, descriptors, :q_descriptors as 
> query_descriptors, comparison(descriptors, :q_descriptors) as similarity' 
> FROM tbl WHERE compare(descriptors, :q_descriptors) >= :q_threshold 
>
> which works well. However, when I add .limit() or .offset() on the end of 
> my query i.e.
> query = db.select([tbl, text(":q_descriptors as query_descriptors, 
> comparison(descriptors, :q_descriptors) as similarity" 
> ).bindparams(q_descriptor=input_descriptor).where( 
> text("comparison(descriptors, :q_descriptors) >= :q_threshold") 
> .bindparams(q_descriptor=input_descriptor,q_threshold = 
> threshold).limit(limit) 
>
> The SQL generated changes to be along these lines: 
> SELECT ID, last_modified, descriptors FROM (SELECT tbl.ID as ID, 
> tbl.last_modified as last_modified, tbl.descriptors as descriptors, 
> :q_descriptors as query_descriptors, comparison(descriptors, 
> :q_descriptors) as similarity) FROM tbl WHERE compare(descriptors, 
> :q_descriptors) >= :q_threshold WHERE ROWNUM <= :q_limit 
>
> As a raw SQL query this is fine, but I'm no longer including the 
> query_descriptors and similarity metrics in my SELECT clause. Thus I get a 
> columnNotFoundError. How do I adjust the .select() function above so that 
> my SQL looks more like:
> SELECT ID, last_modified, descriptors, query_descriptors, similarity FROM 
> (SELECT tbl.ID as ID, tbl.last_modified as last_modified, tbl.descriptors 
> as descriptors,:q_descriptors as query_descriptors comparison(descriptors, 
> :q_descriptors) as similarity' FROM tbl WHERE compare(descriptors, 
> :q_descriptors) >= :q_threshold WHERE ROWNUM <= :q_limit OFFSET :q_offset 
>
> Basically, I'm looking to explicitly tell SQLAlchemy that I want to SELECT 
> tbl AND query_descriptors AND similarity.
>
> I've also been informed that it's bad practice to not include a 
> .order_by() in these queries, but I don't think that is what's causing the 
> issue here. It is on my to do list though.
>
> Please let me know if I need to provide more information.
>
> Thanks for any help,
>
> Dan
>
>
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/ebbe2a3f-4c2b-4512-a093-870c9d842a8en%40googlegroups.com.

Re: [sqlalchemy] How to SELECT computed data using SQLAlchemy?

Reply via email to