Hi All. I have a select query that uses "subqueryload" and looks like this:
completed_imports = self.ra_import_file.visible() \ .filter(ImportFile.lock_date == None) \ .filter(ImportFile.process_date != None) \ .order_by(ImportFile.process_date.desc()) \ .options(subqueryload('user')) \ .all() However, intermittently I am finding results from the query where process_date is None (NULL) in production, which seems weird, since I am using .filter(ImportFile.process_date != None). This table is quite busy, and there can be quite a few rows that turn into "complete imports" by changing their process_date from NULL to an actual date. Because I am using subqueryload(), two SQL statements are issued: First query: SELECT import_file.create_date ... FROM import_file WHERE import_file.import_file_id IN (SELECT import_file.import_file_id AS import_file_import_file_id FROM import_file WHERE import_file.lock_date IS NULL AND import_file.process_date IS NOT NULL ORDER BY import_file.process_date DESC Second query: SELECT "user".password AS user_password... , anon_1.import_file_user_id AS anon_1_import_file_user_id FROM (SELECT import_file.user_id AS import_file_user_id FROM import_file WHERE import_file.import_file_id IN (SELECT import_file.import_file_id AS import_file_import_file_id FROM import_file WHERE import_file.lock_date IS NULL AND import_file.process_date IS NOT NULL) AS anon_1 JOIN "user" ON anon_1.import_file_user_id = "user".user_id ORDER BY anon_1.import_file_user_id My [unconfirmed] theory why I occasionally get rows returned that have NULL process_dates is that race conditions like this happen when under high load: 1) Row 'foo' in the file_import table has process_date = NULL 2) First SQL query runs (SELECT import_file.create_date ...) and the 'foo' row is not returned in the result set because it does not match the "AND import_file.process_date IS NOT NULL" where clause. 3) External process completes import and sets process_date = <current_date> for row 'foo' 4) Second SQL query runs (SELECT "user".password AS user_password...) and the 'foo' row is now in the result set because it matches the "AND import_file.process_date IS NOT NULL" where clause. 5) "subqueryload" joins both SQL queries together in results in python (sorry, I got a bit lost trying to trace this in sqlalchemy code), resulting in row 'foo' being in the results, but missing values such as 'process_date', which are only queried in the first SQL query. Am I completely wrong? Does subqueryload() only return results if each row was returned in both SQL queries? Thanks. - Paul -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.