Hi Michael, Thanks for such a quick reply. I enjoyed reading it! I actually inherited this API (I swear I'm not just making excuses!) from a colleague who left a few months earlier, so it's very much been a case of 'Figuring it out as I go along'.
Apologies for the incomplete code - despite it not being particularly exciting code, I wanted to double check that I'm allowed to post it publicly. So the original (complete) code for this function is as follows: def similar_structure_matches(smiles, similarity_threshold): struc_sim_query = db.select([structures_tbl, text(":q_smiles as query_smiles, jc_tanimoto(canonical_smiles, :q_smiles) as similarity").bindparams(q_smiles=smiles)]). \ where(text("jc_tanimoto(canonical_smiles, :q_smiles) >= :q_sim"). bindparams(q_smiles=smiles, q_sim=similarity_threshold)) struc_sim_res = struc_sim_query.execute().fetchall() if len(struc_sim_res) == 0: return '', 204 returnMatchLimaSchema(many=True).dump(struc_sim_res) The above code is used to generate tanimoto (similarity) scores for the queried structure against each structure in the database( SMILES describe chemical structures). As I understand it, the jc_tanimoto function comes from the Chemaxon Cartridge which we have installed on our Oracle server (Cartridge API | ChemAxon Docs <https://docs.chemaxon.com/display/docs/cartridge-api.md#src-1803469-cartridgeapi-jc-tanimoto> ). I'm not entirely sure how to call this function, without it being wrapped by text(). As I understand it, the code you sent across would be applying the comparison (now jc_tanimoto) function in the Python; not within Oracle itself (of course, that was impossible for you to predict with the incomplete code I sent across). Could you please advise on how to correctly structure this query? Thanks again, Dan On Tuesday, 14 December 2021 at 13:31:12 UTC Mike Bayer wrote: > > > On Tue, Dec 14, 2021, at 5:40 AM, 'Daniel Grindrod' via sqlalchemy wrote: > > Hi all, > > I'm working on a REST API which is built using Flask-SQLAlchemy and > Connexion. I'm fairly new to SQLAlchemy, but it's been brilliant so far > :) This API uses SQLAlchemy 1.3.16, and connects to an Oracle Database (12c > 12.1.0.1.0 64bit). > <https://stackoverflow.com/posts/70341129/timeline> > > I'm having an issue generating the correct SQL from a SQLAlchemy query. I > would really appreciate any help. The troublesome function is shown below. > def similar_matches(input_descriptor, threshold, lim=None, offset): > > query = db.select([tbl, text(":q_descriptors as query_descriptors, > comparison(descriptors, :q_descriptors) as > similarity")bindparams(q_descriptor=input_descriptor).\ > where( text("comparison(descriptors, :q_descriptors) >= > q_threshold").bindparams(q_descriptor=input_descriptor, q_threshold = > threshold) > > > heya - > > it's early here but I almost want to be able to tell a story about that > pattern above, which has select(text("cola, colb, colc, ...)) in it. > It's kind of an "anti-unicorn" for me, as I've done many refactorings to > the result-processing part of SQLAlchemy's engine and each time I do so, > there's some internal handwringing over, "what if someone is SELECTING from > a text() that has multiple columns comma-separated in them?", which > specifically is a problem because it means we can't positionally link the > columns we get back from the cursor to the Python expressions that are in > the select() object, and each time it's like, "nah, nobody would do that", > or, "nah, nobody *should* do that", but yet, as there's not a > straightforward way to detect/warn for that, there's a whole set of code / > commentary at > https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/engine/cursor.py#L325 > > which wonders if we'd ever see this. > > and here it is! :) the dark unicorn. So, it's also the source of the > issue here, because the Oracle dialect has to restructure the query to > simulate limit/offset. Soooo.... back into the barn w/ the unicorn and > what we do here is make sure the select() has enough structure so that > SQLAlchemy knows what's going on and here that would look like (note I'm > making some syntactical assumptions about the code above which seems to be > incomplete ): > > from sqlalchemy import literal, func > > query = db.select( > [ > tbl, > literal(input_descriptor).label("query_descriptors"), > func.comparison(tbl.c.descriptors, > q_descriptors).label("similarity") > ]). > where( > func.comparison(tbl.c.descriptors, q_descriptors) >= threshold > > ) > > that way your select() will have .selected_columns entries for every > column in "tbl" plus columns "query_descriptors" and "similarity", and > these will export on outwards to the subquery that the Oracle dialect > creates. > > > > > > > > > > res = query.execute().fetchall() > > if len(res)=0 return '', 204 > > return MatchLimaScheme(many = True).dump(res) > > This SQLAlchemy code takes two inputs (descriptor and threshold), and > searches through each descriptor in an Oracle database, calculating a > similarity measure between the queried descriptor and each stored > descriptor. All rows where similarity score >= threshold are returned in a > JSON. > > The above code works fine, but returns all results - whereas I want to > also be able to include a .offset() and a .limit() (for lazy loading). The > code above generates SQL along these lines: > SELECT ID, last_modified, descriptors, :q_descriptors as > query_descriptors, comparison(descriptors, :q_descriptors) as similarity' > FROM tbl WHERE compare(descriptors, :q_descriptors) >= :q_threshold > > which works well. However, when I add .limit() or .offset() on the end of > my query i.e. > query = db.select([tbl, text(":q_descriptors as query_descriptors, > comparison(descriptors, :q_descriptors) as similarity" > ).bindparams(q_descriptor=input_descriptor).where( > text("comparison(descriptors, :q_descriptors) >= :q_threshold") > .bindparams(q_descriptor=input_descriptor,q_threshold = > threshold).limit(limit) > > The SQL generated changes to be along these lines: > SELECT ID, last_modified, descriptors FROM (SELECT tbl.ID as ID, > tbl.last_modified as last_modified, tbl.descriptors as descriptors, > :q_descriptors as query_descriptors, comparison(descriptors, > :q_descriptors) as similarity) FROM tbl WHERE compare(descriptors, > :q_descriptors) >= :q_threshold WHERE ROWNUM <= :q_limit > > As a raw SQL query this is fine, but I'm no longer including the > query_descriptors and similarity metrics in my SELECT clause. Thus I get a > columnNotFoundError. How do I adjust the .select() function above so that > my SQL looks more like: > SELECT ID, last_modified, descriptors, query_descriptors, similarity FROM > (SELECT tbl.ID as ID, tbl.last_modified as last_modified, tbl.descriptors > as descriptors,:q_descriptors as query_descriptors comparison(descriptors, > :q_descriptors) as similarity' FROM tbl WHERE compare(descriptors, > :q_descriptors) >= :q_threshold WHERE ROWNUM <= :q_limit OFFSET :q_offset > > Basically, I'm looking to explicitly tell SQLAlchemy that I want to SELECT > tbl AND query_descriptors AND similarity. > > I've also been informed that it's bad practice to not include a > .order_by() in these queries, but I don't think that is what's causing the > issue here. It is on my to do list though. > > Please let me know if I need to provide more information. > > Thanks for any help, > > Dan > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com > > <https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > > > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/ebbe2a3f-4c2b-4512-a093-870c9d842a8en%40googlegroups.com.