Re: [sqlalchemy] MSSQL String columns often incorrectly encoded as NVARCHAR

2019-03-31 Thread Ian Wagner
Thanks. I’ll pursue those avenues.

Cheers,
Ian

On Mon, Apr 1, 2019 at 11:30, Mike Bayer wrote:

> On Sun, Mar 31, 2019 at 10:12 PM Ian Wagner 
> wrote:
> >
> >
> > My suggestion would be a pymssql dialect-level patch to send bytestrings
> for String columns, and of course continue passing str/unicode for Unicode
> columns. I'm on the mailing list looking for help with why my solution
> doesn't work as intended with enums (see my GitHub repo). For the moment,
> I've actually just implemented this column myself as shown, and replaced
> all String columns with it (rather than "replacing" String using colspecs),
> and banned all devs on our project from using the regular String column
> type until it's fixed.
>
> Please just use a TypeDecorator for all your String columns where this
> issue is apparent:
>
> class UTF8String(TypeDecorator):
>     impl = String
>
>     def process_bind_param(self, value, dialect):
>         if value is not None:
>             value = value.encode(dialect.encoding)
>         return value
>
> next, the issue should be reported to pymssql, where they should
> likely include options to modify this behavior:
>
> https://github.com/pymssql/pymssql
>
> next, I need this to be confirmed as an issue for pyodbc.  As my
> understanding is that Microsoft is funding pyodbc's development, I'd
> like to see what their approach to this issue is.
>
> *if* it is truly the canonical solution that applications that code
> against these DBAPIs *must* send byte arrays in order to avoid
> crushing performance costs, the drivers need to make that clear.
> Then we can look into introducing either flags or permanent behavior
> such that we encode all Python unicode objects for a *non unicode*
> String datatype under the MSSQL dialects.
>
>
>
> >
> >> Also note pymssql is not well maintained right now due to lack of
> >> funding
> >
> >
> > Noted. We'll look into switching drivers yet again, but the official
> driver lacked several features the last time we evaluated it (I think
> stored proc output parameters were not supported, and it would require
> quite a few syntax changes in areas where we have to write raw SQL).
> >
> > Cheers,
> > Ian
> >
> > On Friday, March 29, 2019 at 11:10:21 PM UTC+9, Mike Bayer wrote:
> >>
> >> Also note pymssql is not well maintained right now due to lack of
> >> funding, so please confirm you can reproduce your performance concerns using
> >> PyODBC with Microsoft's ODBC drivers?  That should be considered to be
> >> the canonically supported driver right now; it works on all platforms
> >> very well now.
> >>
> >> On Fri, Mar 29, 2019 at 10:07 AM Mike Bayer 
> wrote:
> >> >
> >> > OK so I saw that the "N" prefix is not generated with your test case
> >> > either, so I re-read your email.  Can you clarify what you mean by
> >> > "always encoded as NVARCHAR"?   are you referring to the simple fact
> >> > that a Python string object is passed to the driver, and that the
> >> > driver is behind the scenes applying the "N" in any case or is
> >> > otherwise binding it as unicode in such a way that performance is
> >> > impacted?   SQLAlchemy for many years passed bytestrings to drivers
> >> > like pyodbc because they would simply crash if you passed them a
> >> > unicode object, but once they supported it, SQLAlchemy was eager to
> >> get out of the business of doing this encoding.  In 1.3 we've just
> >> deprecated all the flags that allow it to do this
> >> (convert_unicode=True).  Using that flag would be your quickest way
> >> > to get it back for now but we'd have to develop an all new behavior
> >> > for 1.4 if we are to start encoding these binds again, however current
> >> > behavior has been this way for many years and this is the first it's
> >> > being reported in this way.  I would want to look into driver
> >> > configuration options for this as well.
> >> >
> >> >
> >> >
> >> >
> >> > On Fri, Mar 29, 2019 at 9:56 AM Mike Bayer 
> wrote:
> >> > >
> >> > > On Fri, Mar 29, 2019 at 6:20 AM Ian Wagner 
> wrote:
> >> > > >
> >> > > > Hello all,
> >> > > >
> >> > > > I'm trying to get to the bottom of an issue in which Python 3
> (unicode by definition) strings are always encoded as NVARCHAR for at least
> two backends (pymssql and pyodbc). Using bytestrings

Re: [sqlalchemy] MSSQL String columns often incorrectly encoded as NVARCHAR

2019-03-31 Thread Ian Wagner
Thanks for the reply, Mike!

the NVARCHAR thing should not happen if you are comparing to a 
> non-NVARCHAR column.   it only occurs when there is no other context 
> that SQLAlchemy can determine the correct datatype for the Unicode 
> object being passed.


This was my impression as well. I am, as you can see in my MCVE, using the 
ORM, not Core. I am using a String column, which is supposed to map to a 
VARCHAR, whereas Unicode would map to NVARCHAR.

OK so I saw that the "N" prefix is not generated with your test case 
> either, so I re-read your email.  Can you clarify what you mean by 
> "always encoded as NVARCHAR"?   are you referring to the simple fact 
> that a Python string object is passed to the driver, and that the 
> driver is behind the scenes applying the "N" in any case or is 
> otherwise binding it as unicode in such a way that performance is 
> impacted?


That's correct; I made it very clear in my SO post that the underlying 
issue is passing a Python 3 str to the driver, and that under the hood the 
driver is prefixing it. The SQL you see generated by SQLAlchemy is NOT the 
same SQL that actually gets sent to SQL Server (this is not the fault of 
SQLAlchemy, and is beyond the scope of this discussion, but if you're 
trying to debug it this way, you won't have much luck; you need to check 
the SQL Server profiler, the recent expensive queries list, or some other 
method). Here's a direct link to the offending line in 
pymssql: 
https://github.com/pymssql/pymssql/blob/891b20e29e4e247c17b202e8e34e5c739b6090ef/src/_mssql.pyx#L1779.
 
pymssql isn't necessarily wrong in its behavior; it's a low-level driver 
and doesn't have the same kind of context SQLAlchemy does. This will *very* 
negatively impact performance on a table with a lot of rows (in our case, 
it was a very simple index lookup of a ~1.7 million row table).

I noticed the unicode options, and also their deprecation status; hence my 
hesitation in using them. I also completely understand not wanting to pass 
bytestrings in general. However, this is a clear limitation in pymssql and, 
by some reports, pyodbc as well. I don't know which underlying ODBC driver 
the other poster on SO was using beneath the Python layer, though.

but we'd have to develop an all new behavior 
> for 1.4 if we are to start encoding these binds again, however current 
> behavior has been this way for many years and this is the first it's 
> being reported in this way.  I would want to look into driver 
> configuration options for this as well. 


If you check git blame for the pymssql line in question, it's been this way 
for 6 years; it's a very old "problem." I would suggest that nobody has 
noticed it due to lack of scale with a bad database schema. The reason we 
noticed this is because we're working with a 20-year-old schema that 
evolved over the years without any oversight, and has a mess of things like 
string primary keys that we can't easily change. We just happened to notice 
this because we're transitioning the main application logic from classic 
ASP to Python and our core tables are set up rather poorly.

My suggestion would be a pymssql dialect-level patch to send bytestrings 
for String columns, and of course continue passing str/unicode for Unicode 
columns. I'm on the mailing list looking for help with why my solution 
doesn't work as intended with enums (see my GitHub repo). For the moment, 
I've actually just implemented this column myself as shown, and replaced 
all String columns with it (rather than "replacing" String using colspecs), 
and banned all devs on our project from using the regular String column 
type until it's fixed.
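A minimal sketch of that kind of column (the VarcharString name and the 
UTF-8 encoding are assumptions for illustration; this is not the 
implementation from the linked repo):

from sqlalchemy.types import String, TypeDecorator

class VarcharString(TypeDecorator):
    """String variant that binds as bytes on MSSQL so the driver emits a
    plain VARCHAR literal instead of an N'' (NVARCHAR) literal."""

    impl = String

    def process_bind_param(self, value, dialect):
        # Only encode for MSSQL; other dialects keep receiving str as usual.
        if value is not None and dialect.name == "mssql" and isinstance(value, str):
            value = value.encode("utf-8")
        return value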

Also note pymssql is not well maintained right now due to lack of 
> funding


Noted. We'll look into switching drivers yet again, but the official driver 
lacked several features the last time we evaluated it (I think stored proc 
output parameters were not supported, and it would require quite a few 
syntax changes in areas where we have to write raw SQL).

Cheers,
Ian

On Friday, March 29, 2019 at 11:10:21 PM UTC+9, Mike Bayer wrote:
>
> Also note pymssql is not well maintained right now due to lack of 
> funding, so please confirm you can reproduce your performance concerns using 
> PyODBC with Microsoft's ODBC drivers?  That should be considered to be 
> the canonically supported driver right now; it works on all platforms 
> very well now. 
>
> On Fri, Mar 29, 2019 at 10:07 AM Mike Bayer  > wrote: 
> > 
> > OK so I saw that the "N" prefix is not generated with your test case 
> > either, so I re-read your email.  Can you clarify what you mean by 
> > "always encoded as NVARCHAR"?   are you referring to the simple fact 
> > that a Python string object is passed to the driver, and that the 
> > driver is behind the scenes applying the "N

[sqlalchemy] Re: Using operators with multiple InstrumentedAttribute instances

2019-03-29 Thread Ian Miller
The code that makes up the query builder I've been working on is pretty 
extensive, so I'll go through the high-level basics.

I've set up a `_base` class that augments the SQLAlchemy Base model. Note 
the column details retrieved in the `get_column_and_json_key_from_sql_name` 
class method:

class _base:
    """
    This class augments the default SQLAlchemy Base model
    """

    @classmethod
    def get_column_and_json_key_from_sql_name(cls, name):
        """
        Returns the column and, if applicable, the JSON top-level key from the
        JSON dict.

        :param name: Name of field.

        :return:
        """
        assert type(name) == str

        json_key = None
        col_names = name.split(COLUMN_NAME_DELIMITER)
        if len(col_names) == 2:
            name, json_key = col_names

        try:
            col = cls.__table__.c[name]
        except KeyError:
            log.error("Invalid column name: %s", name)
            return None

        return (getattr(cls, col.name), json_key)


We then have a `_build_column` method that is responsible for retrieving 
the details needed to construct the column used in the select statement of 
the SQLAlchemy ORM query:

def _build_column(self):
    field_name = self.db.get("column")
    model = self._object.get("model")

    column_type, column_key = self.db.get("type"), self.db.get("key")
    select_column, json_key = model.get_column_and_json_key_from_sql_name(
        field_name
    )
    select_column = self._construct_json_select_field(
        column_type, select_column, json_key, column_key
    )

    return select_column


What I'm trying to figure out is how to dynamically generate SQLAlchemy ORM 
statements based on the formula. The formulas can be any math equation 
using +, -, /, *, and parentheses:

formula1 = '"metric:1" + "metric:2" + "metric:3"'
formula2 = '"metric:1" + "metric:2" - "metric:3"'
formula3 = '"metric:1" + ("metric:2" * "metric:3")'
formula4 = '"metric:1" / "metric:2"'
formula5 = '"metric:1 / ("metric:2" * "metric:3")'

The InstrumentedAttribute objects I mentioned earlier are the select fields 
for each individual metric. What I need to figure out is how to build a 
SQLAlchemy ORM query by parsing the formula string and evaluating each 
operation in the context of the query. `func.sum` would work for formula1, 
but I'd need to iteratively build the expression with SQLAlchemy ORM helper 
methods for formula2 - formula5.

Per +Jonathan Vanasco's last comment, I've already figured out Phase 1. I'm 
stuck on Phase 2.
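One possible shape for Phase 2 (a rough sketch, not code from the actual 
query builder; it assumes a metric_columns dict mapping each 'metric:<id>' 
string to its InstrumentedAttribute, and ast.Constant requires Python 3.8+):

import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def formula_to_expression(formula, metric_columns):
    # '"metric:1" + "metric:2"' parses as a Python expression whose leaves
    # are string constants; fold it into a SQLAlchemy column expression.
    def fold(node):
        if isinstance(node, ast.Expression):
            return fold(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](fold(node.left), fold(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            return metric_columns[node.value]  # InstrumentedAttribute lookup
        raise ValueError("unsupported token in formula: %r" % (node,))

    return fold(ast.parse(formula, mode="eval"))

# e.g. session.query(formula_to_expression(formula3, metric_columns)).all()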

On Thursday, March 28, 2019 at 4:43:56 PM UTC-4, Ian Miller wrote:
>
> Hello all,
>
> I am in the process of trying to create a dynamic expression query engine 
> in an application I'm working on.
>
> So there is a formula that gets defined like so: 
>
> formula = '"metric:123" + "metric:456" + "metric:789"'
>
> Each metric maps to a column in the database tables - long story short, 
> I'm able to retrieve the metric by ID, and instantiate an 
> InstrumentedAttribute object that has the SQLAlchemy metadata for the 
> associated column. What I'm trying to achieve is to be able to iterate 
> through the formula, and dynamically build a SQLAlchemy query that maps to 
> the formula.
>
> For example, the formula defined above would look something like this in 
> SQL:
>
> SELECT post.id + campaign.id + asset.id
> FROM post, campaign, asset
> WHERE ..;
>
> The idea is to translate the above to something like:
>
> session.query(<InstrumentedAttribute object at 0x7ff9269f92b0> + <InstrumentedAttribute object at 0x7ff9269c5990> + <InstrumentedAttribute object at 0x7ff926896048>).all()
>
> I've tried a couple of approaches of dynamically generating the SQLAlchemy 
> ORM query, but I haven't been able to find anything that works. Would 
> anyone have any idea or tips on how to accomplish this?
>
> Thank you!
>
>



[sqlalchemy] MSSQL String columns often incorrectly encoded as NVARCHAR

2019-03-29 Thread Ian Wagner
Hello all,

I'm trying to get to the bottom of an issue in which Python 3 (unicode by 
definition) strings are always encoded as NVARCHAR for at least two 
backends (pymssql and pyodbc). Using bytestrings as comparison arguments 
(for example Table.column == value.encode('utf-8')) sends a regular string 
literal as expected, but regular strings are encoded as NVARCHAR literals.

This behavior is fairly logical at the underlying driver (pymssql or 
pyodbc) level, which is why I'm posting here. I believe the use of a 
String column type (as opposed to a Unicode column) should not pass an 
NVARCHAR literal. Doing so has disastrous performance implications, as SQL 
Server ends up casting the whole column up. This will wreak havoc when 
regularly dealing with large-ish tables (1.7 million rows or so in our 
case).

I have previously posted with a LOT more details on StackOverflow 
(https://stackoverflow.com/questions/55098426/strings-used-in-query-always-sent-with-nvarchar-syntax-even-if-the-underlying-c).
 
I also have an MCVE over on GitHub 
(https://github.com/ianthetechie/pymssql_sqlalchemy_55098426).

In my MCVE, I outline a possible approach for fixing this, but it appears 
to have some problems. I'm posting here asking for feedback on what's wrong 
with my approach, and what would be the best way to go about getting this 
fixed.

Thanks!
Ian



[sqlalchemy] Using operators with multiple InstrumentedAttribute instances

2019-03-28 Thread Ian Miller
Hello all,

I am in the process of trying to create a dynamic expression query engine 
in an application I'm working on.

So there is a formula that gets defined like so: 

formula = '"metric:123" + "metric:456" + "metric:789"'

Each metric maps to a column in the database tables - long story short, I'm 
able to retrieve the metric by ID, and instantiate an InstrumentedAttribute 
object that has the SQLAlchemy metadata for the associated column. What I'm 
trying to achieve is to be able to iterate through the formula, and 
dynamically build a SQLAlchemy query that maps to the formula.

For example, the formula defined above would look something like this in 
SQL:

SELECT post.id + campaign.id + asset.id
FROM post, campaign, asset
WHERE ..;

The idea is to translate the above to something like:

session.query(<InstrumentedAttribute ...> + <InstrumentedAttribute ...> + <InstrumentedAttribute ...>).all()

I've tried a couple of approaches of dynamically generating the SQLAlchemy 
ORM query, but I haven't been able to find anything that works. Would 
anyone have any idea or tips on how to accomplish this?

Thank you!



Re: [sqlalchemy] Re: Unable to reference label in subquery at top level of query

2019-02-07 Thread Ian Miller
:facepalm: I should have seen that much sooner. Thank you so much for your 
help. No longer need to do any of that crazy stuff I attempted!
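For reference, the fix boils down to labeling the subquery's columns so the 
outer query can address each one; a minimal 1.3-style sketch (table and 
column names taken from this thread, not the actual application code):

from sqlalchemy import column, select, table

vw_post = table("vw_post", column("id"), column("created_at"))
post_metadata = table("post_metadata", column("post_id"), column("metadata_value"))
pm1 = post_metadata.alias("post_metadata_1")
pm2 = post_metadata.alias("post_metadata_2")

inner = (
    select([pm1.c.metadata_value, pm2.c.metadata_value, vw_post.c.created_at])
    .select_from(
        vw_post.join(pm1, pm1.c.post_id == vw_post.c.id)
               .join(pm2, pm2.c.post_id == vw_post.c.id)
    )
    .apply_labels()  # prefixes each column with its (alias) table name
    .alias("non_interval_query")
)

# the two metadata_value columns are now unambiguous in the outer query
outer = select([
    inner.c.post_metadata_1_metadata_value,
    inner.c.post_metadata_2_metadata_value,
])
print(outer)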

On Thursday, February 7, 2019 at 5:13:41 PM UTC-5, Mike Bayer wrote:
>
> On Thu, Feb 7, 2019 at 2:49 PM Ian Miller  > wrote: 
> > 
> > The reason why I iterated over .c._all_columns was because it shows the 
> references to the 2 "metadata_value" columns, whereas selectable.c only 
> references 1. 
>
>
> so that's a sign that you need to apply labels to these columns 
> because they are conflicting.   with select(), call 
> select().apply_labels() so that these two columns are prefixed with 
> their table name in a label. 
>
>
> > 
> > For example, selectable.c shows the following: 
> > 
> > (Pdb++) list(query.c) 
> > [Column('metadata_value', VARCHAR(length=255), table=<...>),
> > Column('created_at', TIMESTAMP(), table=<...>),
> > <... at 0x7f15e1f6fef0; %(139731962070520 coalesce)s>]
> > 
> > selectable.c._all_columns shows the following: 
> > 
> > (Pdb++) query.c._all_columns 
> > [Column('metadata_value', VARCHAR(length=255), table=<...>),
> > Column('metadata_value', VARCHAR(length=255), table=<...>),
> > Column('created_at', TIMESTAMP(), table=<...>),
> > <... at 0x7f15e1f6fef0; %(139731962070520 coalesce)s>]
> > 
> > 
> > 
> > 
> > On Thursday, February 7, 2019 at 12:53:51 PM UTC-5, Mike Bayer wrote: 
> >> 
> >> just as a note, don't access .c._all_columns, just iterate over 
> selectable.c 
> >> 
> >> On Thu, Feb 7, 2019 at 10:45 AM Ian Miller  wrote: 
> >> > 
> >> > Hello Mike, 
> >> > 
> >> > Thank you for your response! I have currently constructed the ORM 
> implementation that you suggested in your response. Here's the code: 
> >> > 
> >> > def _rebuild_non_interval_query_for_group_by(self, session, query): 
> >> > from sqlalchemy import table, column, select 
> >> > from sqlalchemy.orm import aliased 
> >> > from collections import defaultdict, OrderedDict 
> >> > 
> >> > post_metadata = table( 
> >> > "post_metadata", column("post_id"), column("metadata_value") 
> >> > ) 
> >> > campaign_metadata = table( 
> >> > "campaign_metadata", column("campaign_id"), 
> column("metadata_value") 
> >> > ) 
> >> > asset_metadata = table( 
> >> > "asset_metadata", column("asset_id"), 
> column("metadata_value") 
> >> > ) 
> >> > 
> >> > vw_asset = table("vw_asset", column("id")) 
> >> > vw_campaign = table("vw_campaign", column("id")) 
> >> > vw_post = table("vw_post", column("id")) 
> >> > 
> >> > METADATA_PRIMARY_TABLE_MAP = { 
> >> > asset_metadata.name: vw_asset, 
> >> > campaign_metadata.name: vw_campaign, 
> >> > post_metadata.name: vw_post, 
> >> > } 
> >> > 
> >> > METADATA_NAME_TABLE_MAP = { 
> >> > asset_metadata.name: asset_metadata, 
> >> > campaign_metadata.name: campaign_metadata, 
> >> > post_metadata.name: post_metadata, 
> >> > } 
> >> > 
> >> > primary_tables = set() 
> >> > metadata_columns_count = defaultdict(int) 
> >> > metadata_alias = OrderedDict() 
> >> > columns = [] 
> >> > for c in query.c._all_columns: 
> >> > if c.name == "metadata_value": 
> >> > parent_column = list(c.base_columns)[0] 
> >> > table = parent_column.table 
> >> > primary_tables.add(METADATA_PRIMARY_TABLE_MAP[table.name]) 
>
> >> > metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> += 1 
> >> > alias_number = 
> metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> >> > alias_name = "{}_{}".format(table.name, alias_number) 
> >> > alias = aliased(parent_column.table, alias_name) 
> >> > metadata_alias[alias_name] = alias 
> >> > column = alias.c.metadata_value.label( 
> >> > "{}_{}_{}".format(table.name, alias_number, 
> "metadata_value&

Re: [sqlalchemy] Re: Unable to reference label in subquery at top level of query

2019-02-07 Thread Ian Miller
The reason why I iterated over .c._all_columns was because it shows the 
references to the 2 "metadata_value" columns, whereas selectable.c only 
references 1.

For example, selectable.c shows the following:

(Pdb++) list(query.c)
[Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('created_at', TIMESTAMP(), table=<...>), <...>]

selectable.c._all_columns shows the following:

(Pdb++) query.c._all_columns
[Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('created_at', TIMESTAMP(), table=<...>), <...>]

That way I'm able to keep track of the position of the column in the query, 
and replace with the rebuilt column accordingly.

On Thursday, February 7, 2019 at 12:53:51 PM UTC-5, Mike Bayer wrote:
>
> just as a note, don't access .c._all_columns, just iterate over 
> selectable.c 
>
> On Thu, Feb 7, 2019 at 10:45 AM Ian Miller  > wrote: 
> > 
> > Hello Mike, 
> > 
> > Thank you for your response! I have currently constructed the ORM 
> implementation that you suggested in your response. Here's the code: 
> > 
> > def _rebuild_non_interval_query_for_group_by(self, session, query): 
> > from sqlalchemy import table, column, select 
> > from sqlalchemy.orm import aliased 
> > from collections import defaultdict, OrderedDict 
> > 
> > post_metadata = table( 
> > "post_metadata", column("post_id"), column("metadata_value") 
> > ) 
> > campaign_metadata = table( 
> > "campaign_metadata", column("campaign_id"), 
> column("metadata_value") 
> > ) 
> > asset_metadata = table( 
> > "asset_metadata", column("asset_id"), column("metadata_value") 
> > ) 
> > 
> > vw_asset = table("vw_asset", column("id")) 
> > vw_campaign = table("vw_campaign", column("id")) 
> > vw_post = table("vw_post", column("id")) 
> > 
> > METADATA_PRIMARY_TABLE_MAP = { 
> > asset_metadata.name: vw_asset, 
> > campaign_metadata.name: vw_campaign, 
> > post_metadata.name: vw_post, 
> > } 
> > 
> > METADATA_NAME_TABLE_MAP = { 
> > asset_metadata.name: asset_metadata, 
> > campaign_metadata.name: campaign_metadata, 
> > post_metadata.name: post_metadata, 
> > } 
> > 
> > primary_tables = set() 
> > metadata_columns_count = defaultdict(int) 
> > metadata_alias = OrderedDict() 
> > columns = [] 
> > for c in query.c._all_columns: 
> > if c.name == "metadata_value": 
> > parent_column = list(c.base_columns)[0] 
> > table = parent_column.table 
> > primary_tables.add(METADATA_PRIMARY_TABLE_MAP[table.name]) 
> > metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> += 1 
> > alias_number = 
> metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> > alias_name = "{}_{}".format(table.name, alias_number) 
> > alias = aliased(parent_column.table, alias_name) 
> > metadata_alias[alias_name] = alias 
> > column = alias.c.metadata_value.label( 
> > "{}_{}_{}".format(table.name, alias_number, 
> "metadata_value") 
> > ) 
> > columns.append(column) 
> > else: 
> > columns.append(c) 
> > 
> > # start constructing query 
> > non_interval_query = 
> session.query(*columns).select_from(*primary_tables) 
> > 
> > for alias_name, alias in metadata_alias.items(): 
> > object_type = self._get_object_type_from_metadata_name(
> alias.original.name) 
> > non_interval_query = ( 
> > non_interval_query 
> > .join( 
> > alias, 
> > getattr(alias.c, "{}_id".format(object_type)) == 
> METADATA_PRIMARY_TABLE_MAP[alias.original.name].c.id 
> > ) 
> > ) 
> > 
> > non_interval_query = 
> non_interval_query.subquery("non_interval_query") 
> > 
> > return non_interval_query 
> > 
> > 
> > 
> > The "metadata_alias" values are [('post_metadata_1", alias), 
> ('post_metadata_2', alias)] - the alias correspond to the post_metadata_1 
> and post_metadata_2 alias in your example. However, when I reference these 
> in the join, the 

Re: [sqlalchemy] Re: Unable to reference label in subquery at top level of query

2019-02-07 Thread Ian Miller
The reason why I iterated over .c._all_columns was because it shows the 
references to the 2 "metadata_value" columns, whereas selectable.c only 
references 1.

For example, selectable.c shows the following:

(Pdb++) list(query.c)
[Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('created_at', TIMESTAMP(), table=<...>), <...>]

selectable.c._all_columns shows the following:

(Pdb++) query.c._all_columns
[Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('metadata_value', VARCHAR(length=255), table=<...>),
Column('created_at', TIMESTAMP(), table=<...>), <...>]




On Thursday, February 7, 2019 at 12:53:51 PM UTC-5, Mike Bayer wrote:
>
> just as a note, don't access .c._all_columns, just iterate over 
> selectable.c 
>
> On Thu, Feb 7, 2019 at 10:45 AM Ian Miller  > wrote: 
> > 
> > Hello Mike, 
> > 
> > Thank you for your response! I have currently constructed the ORM 
> implementation that you suggested in your response. Here's the code: 
> > 
> > def _rebuild_non_interval_query_for_group_by(self, session, query): 
> > from sqlalchemy import table, column, select 
> > from sqlalchemy.orm import aliased 
> > from collections import defaultdict, OrderedDict 
> > 
> > post_metadata = table( 
> > "post_metadata", column("post_id"), column("metadata_value") 
> > ) 
> > campaign_metadata = table( 
> > "campaign_metadata", column("campaign_id"), 
> column("metadata_value") 
> > ) 
> > asset_metadata = table( 
> > "asset_metadata", column("asset_id"), column("metadata_value") 
> > ) 
> > 
> > vw_asset = table("vw_asset", column("id")) 
> > vw_campaign = table("vw_campaign", column("id")) 
> > vw_post = table("vw_post", column("id")) 
> > 
> > METADATA_PRIMARY_TABLE_MAP = { 
> > asset_metadata.name: vw_asset, 
> > campaign_metadata.name: vw_campaign, 
> > post_metadata.name: vw_post, 
> > } 
> > 
> > METADATA_NAME_TABLE_MAP = { 
> > asset_metadata.name: asset_metadata, 
> > campaign_metadata.name: campaign_metadata, 
> > post_metadata.name: post_metadata, 
> > } 
> > 
> > primary_tables = set() 
> > metadata_columns_count = defaultdict(int) 
> > metadata_alias = OrderedDict() 
> > columns = [] 
> > for c in query.c._all_columns: 
> > if c.name == "metadata_value": 
> > parent_column = list(c.base_columns)[0] 
> > table = parent_column.table 
> > primary_tables.add(METADATA_PRIMARY_TABLE_MAP[table.name]) 
> > metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> += 1 
> > alias_number = 
> metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] 
> > alias_name = "{}_{}".format(table.name, alias_number) 
> > alias = aliased(parent_column.table, alias_name) 
> > metadata_alias[alias_name] = alias 
> > column = alias.c.metadata_value.label( 
> > "{}_{}_{}".format(table.name, alias_number, 
> "metadata_value") 
> > ) 
> > columns.append(column) 
> > else: 
> > columns.append(c) 
> > 
> > # start constructing query 
> > non_interval_query = 
> session.query(*columns).select_from(*primary_tables) 
> > 
> > for alias_name, alias in metadata_alias.items(): 
> > object_type = self._get_object_type_from_metadata_name(
> alias.original.name) 
> > non_interval_query = ( 
> > non_interval_query 
> > .join( 
> > alias, 
> > getattr(alias.c, "{}_id".format(object_type)) == 
> METADATA_PRIMARY_TABLE_MAP[alias.original.name].c.id 
> > ) 
> > ) 
> > 
> > non_interval_query = 
> non_interval_query.subquery("non_interval_query") 
> > 
> > return non_interval_query 
> > 
> > 
> > 
> > The "metadata_alias" values are [('post_metadata_1", alias), 
> ('post_metadata_2', alias)] - the alias correspond to the post_metadata_1 
> and post_metadata_2 alias in your example. However, when I reference these 
> in the join, the aliased table names are not "post_metadata_1" or 
> "post_metadata_2" 

[sqlalchemy] Re: Unable to reference label in subquery at top level of query

2019-02-07 Thread Ian Miller
Ah - this occurs because the non-metadata_value columns added in from 
"query.c._all_columns" are still referencing the original query. I need to 
figure out how to rebuild those columns as well so that they correctly 
reference the clauses in the new query.

On Tuesday, February 5, 2019 at 11:51:25 AM UTC-5, Ian Miller wrote:
>
> Hello all - 
>
> I am relatively new to using SQLAlchemy for more complex use cases. I am 
> in the process of creating a time series query, but I am unable to 
> reference a column by its alias at the top level of the query.
>
> This is the query that I am trying to address that SQLAlchemy is currently 
> generating:
>
> SELECT non_interval_query.metadata_value AS 
> non_interval_query_metadata_value,
>coalesce(sum(non_interval_query.coalesce_2), 0) AS coalesce_1,
>timestamp
> FROM
>   (SELECT generate_series(date_trunc('day', date(
> '2019-01-06T00:00:00+00:00')), date_trunc('day', date(
> '2019-01-12T00:00:00+00:00')), '1 day') AS timestamp) AS time_series
> LEFT OUTER JOIN
>   (SELECT post_metadata_1.metadata_value AS post_metadata_1_metadata_value
> ,
>   post_metadata_2.metadata_value AS post_metadata_2_metadata_value
> ,
>   vw_post.created_at AS vw_post_created_at,
>   coalesce(count(DISTINCT vw_post.id), 0) AS coalesce_1
>FROM vw_post
>JOIN post_metadata AS post_metadata_1 ON post_metadata_1.post_id = 
> vw_post.id
>JOIN post_metadata AS post_metadata_2 ON post_metadata_2.post_id = 
> vw_post.id
>WHERE post_metadata_1.metadata_value IN ('')
>  AND post_metadata_2.metadata_value IN ('')
>  AND vw_post.created_at >= '2019-01-06T00:00:00+00:00'
>  AND vw_post.created_at <= '2019-01-12T00:00:00+00:00'
>  AND post_metadata_1.schema_uid = ''
>  AND post_metadata_1.metadata_name = ''
>  AND post_metadata_2.schema_uid = ''
>  AND post_metadata_2.metadata_name = ''
>  AND vw_post.license_id IN ()
>GROUP BY vw_post.created_at,
> post_metadata_1.metadata_value,
> post_metadata_2.metadata_value,
> vw_post.created_at) AS non_interval_query ON date_trunc('day', 
> created_at) = timestamp;
>
> You'll notice that "non_interval_query.metadata_value AS 
> non_interval_query_metadata_value" specified at the beginning of the 
> query is ambiguous due to the 2 "metadata_value" selects in the 
> "non_interval_query" subquery. What I'm trying to do is have 2 selects at 
> the top level - one for "non_interval_query.post_metadata_1_metadata_value" 
> and one for "non_interval_query.post_metadata_2_metadata_value".
>
>
> For reference, here is the code used to generate the above query:
>
>
> def apply_date_group_by(self, session, query, range_gb_params):
> field_name = self.db.get("column")
> model = self._object.get("model")
>
> if not field_name or not model:
> raise ValueError("Invalid date group by")
>
> gb_column = self._build_column()
> interval = range_gb_params.get("interval")
> interval_type = range_gb_params.get("interval_type")
>
> time_series = func.generate_series(
> func.date_trunc(interval_type, func.date(range_gb_params["start"])),
> func.date_trunc(interval_type, func.date(range_gb_params["end"])),
> interval,
> ).label("timestamp")
>
> ts_column = column("timestamp")
>
> time_series_query = session.query(time_series).subquery("time_series")
> non_interval_query = query.subquery("non_interval_query")
> # have to replace the original gb_column with the 'timestamp' column
> # in order to properly merge the dataset into the time series dataset
> non_gb_columns, gbs = self._prepare_non_gb_columns(
> ts_column, gb_column, non_interval_query.columns
> )
>
> # construct query with correct position passed in from `range_gb_params`
> query_position = range_gb_params.get("query_index_position", 0)
> non_gb_columns.insert(query_position, ts_column)
>
> date_gb_query = session.query(*non_gb_columns).select_from(
> time_series_query.outerjoin(
> non_interval_query,
> func.date_trunc(interval_type, column(field_name)) == ts_column,
> )
> )
>
> if gbs:
> date_gb_query = date_gb_query.group_by(*gbs)
>
> return date_gb_query.order_by(ts_column)
>
>
>
> Any help on this would be greatly appreciated! 
>


[sqlalchemy] Re: Unable to reference label in subquery at top level of query

2019-02-07 Thread Ian Miller
Hello Mike,

Thank you for your response! I have currently constructed the ORM 
implementation that you suggested in your response. Here's the code:

def _rebuild_non_interval_query_for_group_by(self, session, query):
    from sqlalchemy import table, column, select
    from sqlalchemy.orm import aliased
    from collections import defaultdict, OrderedDict

    post_metadata = table(
        "post_metadata", column("post_id"), column("metadata_value")
    )
    campaign_metadata = table(
        "campaign_metadata", column("campaign_id"), column("metadata_value")
    )
    asset_metadata = table(
        "asset_metadata", column("asset_id"), column("metadata_value")
    )

    vw_asset = table("vw_asset", column("id"))
    vw_campaign = table("vw_campaign", column("id"))
    vw_post = table("vw_post", column("id"))

    METADATA_PRIMARY_TABLE_MAP = {
        asset_metadata.name: vw_asset,
        campaign_metadata.name: vw_campaign,
        post_metadata.name: vw_post,
    }

    METADATA_NAME_TABLE_MAP = {
        asset_metadata.name: asset_metadata,
        campaign_metadata.name: campaign_metadata,
        post_metadata.name: post_metadata,
    }

    primary_tables = set()
    metadata_columns_count = defaultdict(int)
    metadata_alias = OrderedDict()
    columns = []
    for c in query.c._all_columns:
        if c.name == "metadata_value":
            parent_column = list(c.base_columns)[0]
            table = parent_column.table
            primary_tables.add(METADATA_PRIMARY_TABLE_MAP[table.name])
            metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] += 1
            alias_number = metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]]
            alias_name = "{}_{}".format(table.name, alias_number)
            alias = aliased(parent_column.table, alias_name)
            metadata_alias[alias_name] = alias
            column = alias.c.metadata_value.label(
                "{}_{}_{}".format(table.name, alias_number, "metadata_value")
            )
            columns.append(column)
        else:
            columns.append(c)

    # start constructing query
    non_interval_query = session.query(*columns).select_from(*primary_tables)

    for alias_name, alias in metadata_alias.items():
        object_type = self._get_object_type_from_metadata_name(alias.original.name)
        non_interval_query = (
            non_interval_query
            .join(
                alias,
                getattr(alias.c, "{}_id".format(object_type)) ==
                METADATA_PRIMARY_TABLE_MAP[alias.original.name].c.id
            )
        )

    non_interval_query = non_interval_query.subquery("non_interval_query")

    return non_interval_query



The "metadata_alias" values are [('post_metadata_1", alias), 
('post_metadata_2', alias)] - the alias correspond to the post_metadata_1 
and post_metadata_2 alias in your example. However, when I reference these 
in the join, the aliased table names are not "post_metadata_1" or 
"post_metadata_2" - they're "post_metadata_3" and "post_metadata_4". I'm 
unable to figure out why there's a new join seemingly created instead of 
referencing the aliased tables that were passed in.

Here's the query that the above generates:

SELECT post_metadata_1.metadata_value AS post_metadata_1_metadata_value,
       post_metadata_2.metadata_value AS post_metadata_2_metadata_value,
       non_interval_query.created_at,
       non_interval_query.coalesce_1
FROM
  (SELECT post_metadata_3.metadata_value AS metadata_value,
          post_metadata_4.metadata_value AS metadata_value,
          vw_post.created_at AS created_at,
          coalesce(count(DISTINCT vw_post.id), :coalesce_2) AS coalesce_1
   FROM vw_post
   JOIN post_metadata AS post_metadata_3 ON post_metadata_3.post_id = vw_post.id
   JOIN post_metadata AS post_metadata_4 ON post_metadata_4.post_id = vw_post.id
   WHERE post_metadata_3.metadata_value IN (:metadata_value_1, :metadata_value_2)
     AND post_metadata_4.metadata_value IN (:metadata_value_3, :metadata_value_4)
     AND vw_post.created_at >= :created_at_1
     AND vw_post.created_at <= :created_at_2
     AND post_metadata_3.schema_uid = :schema_uid_1
     AND post_metadata_3.metadata_name = :metadata_name_1
     AND post_metadata_4.schema_uid = :schema_uid_2
     AND post_metadata_4.metadata_name = :metadata_name_2
     AND vw_post.license_id IN (:license_id_1, :license_id_2)
   GROUP BY vw_post.created_at, post_metadata_3.metadata_value,
            post_metadata_4.metadata_value, vw_post.created_at) AS non_interval_query,
  vw_post
JOIN post_metadata AS post_metadata_1 ON post_metadata_1.post_id = vw_post.id
JOIN post_metadata AS post_metadata_2 ON post_metadata_2.post_id = vw_post.id;




On

[sqlalchemy] Unable to reference label in subquery at top level of query

2019-02-05 Thread Ian Miller
Hello all - 

I am relatively new to using SQLAlchemy for more complex use cases. I am in 
the process of creating a time series query, but I am unable to reference a 
column by its alias at the top level of the query.

This is the query that I am trying to address that SQLAlchemy is currently 
generating:

SELECT non_interval_query.metadata_value AS 
non_interval_query_metadata_value,
   coalesce(sum(non_interval_query.coalesce_2), 0) AS coalesce_1,
   timestamp
FROM
  (SELECT generate_series(date_trunc('day', date('2019-01-06T00:00:00+00:00'
)), date_trunc('day', date('2019-01-12T00:00:00+00:00')), '1 day') AS 
timestamp) AS time_series
LEFT OUTER JOIN
  (SELECT post_metadata_1.metadata_value AS post_metadata_1_metadata_value,
  post_metadata_2.metadata_value AS post_metadata_2_metadata_value,
  vw_post.created_at AS vw_post_created_at,
  coalesce(count(DISTINCT vw_post.id), 0) AS coalesce_1
   FROM vw_post
   JOIN post_metadata AS post_metadata_1 ON post_metadata_1.post_id = 
vw_post.id
   JOIN post_metadata AS post_metadata_2 ON post_metadata_2.post_id = 
vw_post.id
   WHERE post_metadata_1.metadata_value IN ('')
 AND post_metadata_2.metadata_value IN ('')
 AND vw_post.created_at >= '2019-01-06T00:00:00+00:00'
 AND vw_post.created_at <= '2019-01-12T00:00:00+00:00'
 AND post_metadata_1.schema_uid = ''
 AND post_metadata_1.metadata_name = ''
 AND post_metadata_2.schema_uid = ''
 AND post_metadata_2.metadata_name = ''
 AND vw_post.license_id IN ()
   GROUP BY vw_post.created_at,
post_metadata_1.metadata_value,
post_metadata_2.metadata_value,
vw_post.created_at) AS non_interval_query ON date_trunc('day', 
created_at) = timestamp;

You'll notice that "non_interval_query.metadata_value AS 
non_interval_query_metadata_value" specified at the beginning of the query 
is ambiguous due to the 2 "metadata_value" selects in the 
"non_interval_query" subquery. What I'm trying to do is have 2 selects at 
the top level - one for "non_interval_query.post_metadata_1_metadata_value" 
and one for "non_interval_query.post_metadata_2_metadata_value".


For reference, here is the code used to generate the above query:


def apply_date_group_by(self, session, query, range_gb_params):
    field_name = self.db.get("column")
    model = self._object.get("model")

    if not field_name or not model:
        raise ValueError("Invalid date group by")

    gb_column = self._build_column()
    interval = range_gb_params.get("interval")
    interval_type = range_gb_params.get("interval_type")

    time_series = func.generate_series(
        func.date_trunc(interval_type, func.date(range_gb_params["start"])),
        func.date_trunc(interval_type, func.date(range_gb_params["end"])),
        interval,
    ).label("timestamp")

    ts_column = column("timestamp")

    time_series_query = session.query(time_series).subquery("time_series")
    non_interval_query = query.subquery("non_interval_query")
    # have to replace the original gb_column with the 'timestamp' column
    # in order to properly merge the dataset into the time series dataset
    non_gb_columns, gbs = self._prepare_non_gb_columns(
        ts_column, gb_column, non_interval_query.columns
    )

    # construct query with correct position passed in from `range_gb_params`
    query_position = range_gb_params.get("query_index_position", 0)
    non_gb_columns.insert(query_position, ts_column)

    date_gb_query = session.query(*non_gb_columns).select_from(
        time_series_query.outerjoin(
            non_interval_query,
            func.date_trunc(interval_type, column(field_name)) == ts_column,
        )
    )

    if gbs:
        date_gb_query = date_gb_query.group_by(*gbs)

    return date_gb_query.order_by(ts_column)



Any help on this would be greatly appreciated! 



[sqlalchemy] Re: Support for native PostgreSQL "polygon" type?

2015-09-05 Thread Ian McCullough
Is there some compelling reason you wouldn't just install the PostGIS 
extensions? Assuming there is...

You could use a TEXT column to store WKT values or a BLOB/bytea column to 
store WKB representations, and perhaps use hybrid attributes 
<http://docs.sqlalchemy.org/en/rel_1_0/orm/extensions/hybrid.html> to 
manage the conversion. But realize that (without PostGIS) the DB doesn't 
semantically understand these types, so you won't be able to filter 
against them (other than IS [NOT] NULL) or use them in any meaningful SQL 
expressions. You'll likely be limited to CRUD capabilities. 

Actually, now that I think about it, since you won't be able to use them in 
query expressions anyway, there's probably no reason to go to the extra 
effort of hybrid properties; just make a pure python property to do the 
conversion between the mapped TEXT|BLOB column and the Geometry types.
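Something along these lines (a sketch only - the Region model is invented, 
and shapely is just one way to do the WKT round trip, not required by 
anything above):

import shapely.wkt
from sqlalchemy import Column, Integer, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Region(Base):
    __tablename__ = "region"
    id = Column(Integer, primary_key=True)
    boundary_wkt = Column(Text)  # the mapped column stores plain WKT text

    @property
    def boundary(self):
        # convert the stored WKT back into a geometry object on access
        return shapely.wkt.loads(self.boundary_wkt) if self.boundary_wkt else None

    @boundary.setter
    def boundary(self, geom):
        # store the geometry as WKT text
        self.boundary_wkt = shapely.wkt.dumps(geom) if geom is not None else None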

But again, trying to use geo types in PostgreSQL without PostGIS just kinda 
sounds like self-flagellation. :)

Ian
 


On Friday, September 4, 2015 at 6:59:29 PM UTC-4, Demitri Muna wrote:
>
> Hi,
>
> Is there a way to use the native PostgreSQL (i.e. not postgis) "polygon" 
> data type through SQLAlchemy? I'm guessing one might be able to define the 
> model class via autoload as normal, then add some kind of custom column 
> definition? Surely someone has done this before, but I haven't been able to 
> find an example.
>
> I've taken a look at geoalchemy and tried to implement that, but was 
> getting this error:
>
> ProgrammingError: (psycopg2.ProgrammingError) function 
> st_asbinary(polygon) does not exist
> LINE 1: sdss_frame.filename AS muna_sdss_frame_filename, ST_AsBinar...
>  ^
> HINT:  No function matches the given name and argument types. You might 
> need to add explicit type casts.
>
> If at all possible, I'd prefer to use the native data type, even if it 
> requires a custom adaptor. (Again, surely someone has written this?)
>
> Cheers,
> Demitri
>



Sharing metadata between alembic and SQLAlchemy

2015-08-01 Thread Ian McCullough
I've been getting up to speed with SQLAlchemy and alembic and having a 
great time of it. This is some great stuff!

One thing that's been confounding me is this: My Alembic schema revisions 
are 'authoritative' for my metadata (i.e. I've started from scratch using 
alembic to build the schema from nothing), yet it doesn't appear that the 
metadata that exists in my alembic scripts can be leveraged by my models in 
my main app. So far, I've been maintaining effectively two versions of the 
metadata, one in the form of the flattened projection of my alembic 
schema rev scripts, and another in my application models scripts. I 
understand that there are some facilities to auto-re-generate the metadata 
from the RDBMS on the application side, but that seems potentially lossy, 
or at least subject to the whims of whatever DBAPI provider I'm using.

Is there a way to pull this flattened projection of metadata out of alembic 
and into my app's models at runtime? (i.e. import alembic, read the version 
from the live DB, then build the metadata by playing the upgrade scripts 
forward, not against the database, but against a metadata instance?) It 
seems like a fool's errand to try to keep my app models in sync with the 
flattened projection of the schema revisions by hand. My assumption is that 
I'm missing something super-obvious here.

Thanks,
Ian
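The more common arrangement is the reverse of this: the application models 
own the metadata, and Alembic's env.py imports it as the autogenerate 
target. A sketch, with "myapp.models" standing in for the real module path:

# env.py (sketch)
from myapp.models import Base   # the application's declarative Base

# alembic autogenerate diffs the live database against this metadata
target_metadata = Base.metadata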




[sqlalchemy] Python 3.4 Enum type

2013-08-06 Thread Ian Kelly
Since Python 3.4 is adding support for enums to the standard library,
I wrote a TypeDecorator for it:

import sqlalchemy.types as types


class PythonEnum(types.TypeDecorator):

    impl = types.Enum

    def __init__(self, enum_class, **kw):
        super().__init__(*(m.name for m in enum_class), **kw)
        self._enum_class = enum_class

    def process_bind_param(self, value, dialect):
        return value.name

    def process_result_value(self, value, dialect):
        return self._enum_class[value]

    @property
    def python_type(self):
        return self._enum_class
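
For example (a usage sketch only - the Color enum and Widget model are 
invented; the decorator stores the member *name* in the database):

import enum
from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Color(enum.Enum):
    red = 1
    green = 2
    blue = 3

class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    # stored as 'red', 'green', or 'blue' in the DB column
    color = Column(PythonEnum(Color, name='color_enum'))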


Comments welcome.  Is this something that should be added to sqlalchemy?





Re: [sqlalchemy] between .one() and .first()

2013-06-18 Thread ian marcinkowski
From the docs:
one()

Return exactly one result or raise an exception.

Raises sqlalchemy.orm.exc.NoResultFound if the query selects no rows.

Raises sqlalchemy.orm.exc.MultipleResultsFound if multiple object
identities are returned, or if multiple rows are returned for a query
that does not return object identities.

Thus, you could:

from sqlalchemy.orm.exc import NoResultFound, MultipleResultsFound

try:
  result = query.one()
  print('Yay! One result!')
  return result
except NoResultFound:
  # Deal with case of zero results
  print('Zero results is also good!')
except MultipleResultsFound:
  # Deal with case of more than one result
  print('This should not happen. :(')



On Mon, Jun 17, 2013 at 1:57 PM, Petr Viktorin encu...@gmail.com wrote:
 Simply handling NoResultFound should work just fine...

 def zero_or_one(query):
     try:
         return query.one()
     except NoResultFound:
         return None

 On Mon, Jun 17, 2013 at 6:36 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:

 On Jun 17, 2013, at 5:33 AM, Wichert Akkerman wich...@wiggy.net wrote:


 On Jun 17, 2013, at 08:58 , Chris Withers ch...@simplistix.co.uk wrote:

 Hi All,

 I seem to commonly need to do a query which should return zero or one 
 mapped objects.

 .one() isn't what I want as no returned object is ok.

 .first() isn't what I want as if my query would return more than one 
 object, I have the query wrong and so want an exception.

 Is there something already available that meets this need?

 This will require you to run a query which limits the result to max 2 rows 
 so you can check if more than one result would be returned. I would just 
 create a simple wrapper function:


 def one_optional(query):
     rows = query.limit(2).all()
     count = len(rows)
     if count == 0:
         return None
     elif count == 1:
         return rows[0]
     else:
         raise RuntimeError('More than one result found.')

 you could even drop the limit(2) so that the query renders more simply if 
 eagerloads are present.






-- 
Ian Marcinkowski
ianmarcinkow...@gmail.com





Re: [sqlalchemy] Tea Rota

2012-01-13 Thread Ian Kelly
On Thu, Jan 12, 2012 at 3:14 PM, Calum MacLeod pan...@sky.com wrote:
     if date not in duties:
         duty = Duty(date=date)
         duties[date] = duty

     if name not in volunteers:
         volunteer = Volunteer(fore=fore, surn=surn, name=name)
         volunteers[name] = volunteer
     volunteer.duties.append(duty)

If the date or the volunteer are already in the respective dicts, you
never assign the 'duty' or 'volunteer' variable.  Thus it would retain
its value from the last iteration.  This is probably why you seem to
be missing volunteers -- some of your dict entries are getting
clobbered with the wrong volunteers.
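In other words, something like this for the loop body (just a sketch of the 
fix; Duty and Volunteer are the classes from your snippet, so this isn't a 
complete script on its own):

if date not in duties:
    duties[date] = Duty(date=date)
duty = duties[date]          # always (re)bind duty for this row

if name not in volunteers:
    volunteers[name] = Volunteer(fore=fore, surn=surn, name=name)
volunteer = volunteers[name]  # always (re)bind volunteer for this row

volunteer.duties.append(duty)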

Cheers,
Ian




[sqlalchemy] subqueryload problem

2011-11-10 Thread Ian Wilson
 AS
sights_1_image_id
FROM (SELECT tours.id AS tours_id
FROM tours JOIN users ON tours.owner_id = users.id
WHERE tours.resource_state = %(resource_state_1)s AND
users.resource_state = %(resource_state_2)s AND tours.published = %
(published_1)s ORDER BY tours.last_changed_on DESC
 LIMIT %(param_1)s OFFSET %(param_2)s) AS anon_1 JOIN sights AS
sights_1 ON anon_1.tours_id = sights_1.tour_id AND
sights_1.parent_sight_id IS NULL AND sights_1.precedence = %
(precedence_1)s JOIN images ON images.id = sights_1.image_id AND
sights_1.tour_id = tours.id AND tours.resource_state != %
(resource_state_3)s ORDER BY sights_1.image_id
2011-11-10 11:45:22,122 INFO  [sqlalchemy.engine.base.Engine]
[MainThread] {'precedence_1': 0, 'param_1': 101, 'param_2': 0,
'resource_state_3': 3, 'resource_state_2': 1, 'resource_state_1': 1,
'published_1': True}
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/query.py, line 1922, in all
return list(self)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/query.py, line 2153, in instances
rows = [process[0](row, None) for row in fetch]
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/mapper.py, line 2528, in _instance
eager_populators
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/mapper.py, line 2711, in _populators
self, row, adapter)):
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/interfaces.py, line 338, in
create_row_processor
reduced_path, mapper, row, adapter)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/strategies.py, line 894, in
create_row_processor
lambda x:x[1:]
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/strategies.py, line 891, in genexpr
(k, [v[0] for v in v])
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/query.py, line 2156, in instances
labels) for row in fetch]
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/mapper.py, line 2528, in _instance
eager_populators
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/mapper.py, line 2711, in _populators
self, row, adapter)):
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/interfaces.py, line 338, in
create_row_processor
reduced_path, mapper, row, adapter)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/strategies.py, line 894, in
create_row_processor
lambda x:x[1:]
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/query.py, line 2032, in __iter__
return self._execute_and_instances(context)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/orm/query.py, line 2047, in
_execute_and_instances
result = conn.execute(querycontext.statement, self._params)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/engine/base.py, line 1399, in execute
params)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/engine/base.py, line 1532, in
_execute_clauseelement
compiled_sql, distilled_params
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/engine/base.py, line 1640, in
_execute_context
context)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/engine/base.py, line 1633, in
_execute_context
context)
  File /home/ian/workspace/gardentheory/gardentheory/ve/lib/python2.7/
site-packages/sqlalchemy/engine/default.py, line 330, in do_execute
cursor.execute(statement, parameters)
ProgrammingError: (ProgrammingError) missing FROM-clause entry for
table tours
LINE 5: ...ges.id = sights_1.image_id AND sights_1.tour_id = tours.id
A...
 ^
 'SELECT images.id AS images_id, images.url AS images_url,
images.title AS images_title, images.height AS images_height,
images.width AS images_width, images.owner_id AS images_owner_id,
images.resource_state AS images_resource_state, images.system AS
images_system, images.created_on AS images_created_on,
images.modified_on AS images_modified_on, sights_1.image_id AS
sights_1_image_id \nFROM (SELECT tours.id AS tours_id \nFROM tours
JOIN users ON tours.owner_id = users.id \nWHERE tours.resource_state =
%(resource_state_1)s AND users.resource_state = %(resource_state_2)s
AND tours.published = %(published_1)s ORDER BY tours.last_changed_on
DESC \n LIMIT

Re: [sqlalchemy] Oracle Connection Oddities

2011-07-14 Thread Ian Kelly
On Thu, Jul 14, 2011 at 12:53 PM, Burhan burhan.kha...@gmail.com wrote:
 I am not sure what version of cx_Oracle it is - it was downloaded as a
 Windows binary - the latest version is 5.1 on the cx_Oracle download page.

 import cx_Oracle
 print cx_Oracle.version




[sqlalchemy] Re: Trouble detaching all objects with relationships intact

2010-12-07 Thread Ian Thompson
On Dec 6, 5:53 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 On Dec 6, 2010, at 11:52 AM, Ian Thompson wrote:





  On Dec 6, 4:49 pm, Ian Thompson quorn...@gmail.com wrote:
  I've generated a test data set and then want to have all of the
  created objects and relationships available for reference easily.
  To avoid possible pollution of this fixture I want to detach the
  objects from the sessions to avoid them getting any updates applied.

  To do this I am attempting to load all data via joinedload and then
  calling session.expunge_all:

      print assets['test_asset_1'].attributes # OK

  session.query(Asset).options(joinedload(Asset.asset_attributes)).populate_existing().all()
      session.expunge_all()
      print assets['test_asset_1'].attributes # ERROR

  None of the relationships on the objects survive the expunge.

      sqlalchemy.orm.exc.DetachedInstanceError:
      Parent instance Asset at ... is not bound to a Session;
      lazy load operation of attribute 'asset_attributes' cannot proceed
  (cont...)
  Is there a correct way to detach all current data from the Session
  fully loading any relationships? (Small data set so memory is not a
  concern.)

 This use case seems like it would be more easily solved using a new Session 
 just for that load.   To work with the objects detached, all deferred 
 attributes and lazy-loading relationship attributes which you will need to 
 access would need to either be eagerly loaded (there's now three ways to 
 eagerly load relationship() attributes), or otherwise accessed via 
 obj.attribute before the objects are expunged.

Hi Michael, thanks for your reply.

I had thought by using joinedload (or eagerload) and populate_existing
I would be doing an eager load of the specified relation.

Also, in my example I do access the attributes relationship (with a
print), then after the expunge the same fails. Is there a way I can
ensure the data is preserved after expunging?

Thanks
Ian
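
For reference, a minimal sketch of the approach Michael describes above: load the fixture in a throwaway session with the relationship eagerly loaded, touch anything you will need while detached, and only then expunge. Session, Asset and Asset.asset_attributes here are the names from this thread and are not defined in the sketch itself:

    from sqlalchemy.orm import joinedload

    fixture_session = Session()  # session used only for loading the fixture
    assets = fixture_session.query(Asset).options(
        joinedload(Asset.asset_attributes)).all()
    for asset in assets:
        asset.asset_attributes  # touch anything needed later; detached objects cannot lazy-load
    fixture_session.expunge_all()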




[sqlalchemy] Joinedload with association proxies

2010-12-06 Thread Ian Thompson
Hi,

Apologies if this has been answered or doesn't make sense, I'm still
just getting started here.

Passing an association proxy to the joinedload function doesn't
seem to work; I have to use the original relationship name. Is this the
expected behaviour, or am I doing something stupid?

class SomeTable(Base):
blahs = association_proxy('foo_blahs', 'blah', creator=...,
getset_factory=...)

This doesn't work:
session.query(SomeTable).options(joinedload(SomeTable.blahs))

Have to do this:
session.query(SomeTable).options(joinedload(SomeTable.foo_blahs))

Thanks (and Hi)

Ian
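
For anyone landing on this thread later, a small self-contained sketch of the workaround with hypothetical Blah/FooBlah/SomeTable models (0.6-era declarative API): the loader option has to name the mapped relationship the proxy wraps, not the proxy itself.

    from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
    from sqlalchemy.ext.associationproxy import association_proxy
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import relationship, sessionmaker, joinedload

    Base = declarative_base()

    class Blah(Base):
        __tablename__ = 'blah'
        id = Column(Integer, primary_key=True)
        name = Column(String(50))

    class FooBlah(Base):
        # association object linking SomeTable to Blah
        __tablename__ = 'foo_blah'
        id = Column(Integer, primary_key=True)
        some_table_id = Column(Integer, ForeignKey('some_table.id'))
        blah_id = Column(Integer, ForeignKey('blah.id'))
        blah = relationship(Blah)

    class SomeTable(Base):
        __tablename__ = 'some_table'
        id = Column(Integer, primary_key=True)
        foo_blahs = relationship(FooBlah)               # the mapped relationship
        blahs = association_proxy('foo_blahs', 'blah')  # convenience proxy

    engine = create_engine('sqlite:///:memory:')
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()

    # eager-load through the relationship the proxy is built on:
    session.query(SomeTable).options(joinedload(SomeTable.foo_blahs)).all()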




[sqlalchemy] Trouble detaching all objects with relationships intact

2010-12-06 Thread Ian Thompson
I've generated a test data set and then want to have all of the
created objects and relationships available for reference easily.
To avoid possible pollution of this fixture I want to detach the
objects from the sessions to avoid them getting any updates applied.

To do this I am attempting to load all data via joinedload and then
calling session.expunge_all:

print assets['test_asset_1'].attributes # OK
 
session.query(Asset).options(joinedload(Asset.asset_attributes)).populate_existing().all()
session.expunge_all()
print assets['test_asset_1'].attributes # ERROR

None of the relationships on the objects survive the expunge.

sqlalchemy.orm.exc.DetachedInstanceError:
Parent instance Asset at ... is not bound to a Session;
lazy load operation of attribute 'asset_attributes' cannot proceed




[sqlalchemy] Re: Trouble detaching all objects with relationships intact

2010-12-06 Thread Ian Thompson
On Dec 6, 4:49 pm, Ian Thompson quorn...@gmail.com wrote:
 I've generated a test data set and then want to have all of the
 created objects and relationships available for reference easily.
 To avoid possible pollution of this fixture I want to detach the
 objects from the sessions to avoid them getting any updates applied.

 To do this I am attempting to load all data via joinedload and then
 calling session.expunge_all:

     print assets['test_asset_1'].attributes # OK

 session.query(Asset).options(joinedload(Asset.asset_attributes)).populate_existing().all()
     session.expunge_all()
     print assets['test_asset_1'].attributes # ERROR

 None of the relationships on the objects survive the expunge.

     sqlalchemy.orm.exc.DetachedInstanceError:
     Parent instance Asset at ... is not bound to a Session;
     lazy load operation of attribute 'asset_attributes' cannot proceed
(cont...)
Is there a correct way to detach all current data from the Session
fully loading any relationships? (Small data set so memory is not a
concern.)

Ian




Re: [sqlalchemy] group by and Oracle

2010-12-03 Thread Ian Kelly
On Fri, Dec 3, 2010 at 4:08 AM, jo jose.soa...@sferacarta.com wrote:
 Hi all,

 I'm trying to write a GROUP BY query grouped by a function (to_char) using a
 variable format, which could be 'yy' or ''
 as in:

 sql=session.query(
   func.to_char(Prestazione.c.data,format),
   func.sum(Prestazione.c.quantita).label('quantita'),
   func.sum(Prestazione.c.importo).label('importo')
   )
 sql=sql.filter(Verifica.c.codice == Tariffa.c.codice)
 sql=sql.filter(Prestazione.c.id_tariffa == Tariffa.c.id)
 sql=sql.group_by(Verifica.c.codice, func.to_char(Prestazione.c.data,format))

Have you tried using the same func result in both places, i.e.:

to_char = func.to_char(Prestazione.c.data,format)

sql=session.query(
  to_char,
  func.sum(Prestazione.c.quantita).label('quantita'),
  func.sum(Prestazione.c.importo).label('importo')
  )
sql=sql.filter(Verifica.c.codice == Tariffa.c.codice)
sql=sql.filter(Prestazione.c.id_tariffa == Tariffa.c.id)
sql=sql.group_by(Verifica.c.codice, to_char)
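
A tiny self-contained sketch of that suggestion, using a hypothetical table and only the SQL expression layer so it simply prints the statement: because the to_char() expression is built once, the select list and the GROUP BY render the identical expression.

    from sqlalchemy import func, select
    from sqlalchemy.sql import table, column

    prestazione = table('prestazione', column('data'), column('quantita'))
    fmt = 'yyyy'
    to_char = func.to_char(prestazione.c.data, fmt)
    stmt = select([to_char,
                   func.sum(prestazione.c.quantita).label('quantita')]
                  ).group_by(to_char)
    print(stmt)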




Re: [sqlalchemy] new LIMIT/OFFSET Support for oracle - huge speed penalty

2010-06-23 Thread Ian Kelly
On Wed, Jun 23, 2010 at 1:50 AM, Ralph Heinkel ralph.hein...@web.de wrote:
 Hi,

 we are about upgrading our sqlalchemy library from 0.4.8 to something newer
 and during this process we have detected that the LIMIT/OFFSET support for
 oracle has been changed, from using  “ROW NUMBER OVER...” to a wrapped
 subquery approach in conjunction with ROWNUM as described in
 http://www.sqlalchemy.org/docs/reference/dialects/oracle.html#limit-offset-support


 Unfortunately this approach is about 10 times slower for large tables, which
 is mainly related to the fact that the innermost subquery has to sort the
 entire table with a plain 'order by'.
 Interestingly the
        ROW_NUMBER() OVER (ORDER BY some db fields)
 is so much more efficient than the normal order by approach.

Do you have benchmarks to back that up?  In Django, we switched from
using row_number to rownum after a contributor convinced me that
rownum was faster.  See:

http://code.djangoproject.com/ticket/9136

Thanks,
Ian
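
For anyone curious what the wrapped form under discussion looks like, a small sketch that just compiles a LIMIT/OFFSET select against the Oracle dialect (hypothetical table; the exact SQL emitted depends on the SQLAlchemy version in use):

    from sqlalchemy import Column, Integer, MetaData, Table, select
    from sqlalchemy.dialects import oracle

    t = Table('t', MetaData(), Column('id', Integer, primary_key=True))
    stmt = select([t]).order_by(t.c.id).limit(10).offset(20)
    print(stmt.compile(dialect=oracle.dialect()))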




Re: [sqlalchemy] Re: use_ansi oracle sysdate vs. current_date

2010-04-02 Thread Ian Kelly
On Fri, Apr 2, 2010 at 3:41 PM, Michael Bayer mike...@zzzcomputing.comwrote:

 Kent wrote:
  Along the same lines, is there something we can do about nvl()
  (oracle) versus coalesce() (ansi)?
 
  They aren't exactly the same, unfortunately (nvl takes exactly 2
  arguments, no more), so maybe there is nothing 'official' you can do,
  but can you help me work it out for my project?
 
  I assume it is very similar to what you helped me out with
  above...something like this:
 
  
  from sqlalchemy.sql.expression import ColumnElement
  from sqlalchemy.ext.compiler import compiles
  from sqlalchemy.types import DateTime, Date
 
  class current_date(ColumnElement):
  type = Date()
 
  @compiles(current_date)
  def _compiler_dispatch(element, compiler, **kw):
  if compiler.dialect.name == 'oracle':
  if compiler.dialect.use_ansi:
  return "trunc(current_date)"
  else:
  return "trunc(sysdate)"
  else:
  # assume postgres
  return "current_date"
  
 
 
  But the main difference is it takes arguments.
 
  Is there a clever way to return the appropriate function, something
  like this:
 
  def 
  if compiler.dialect.name == 'oracle':
  return func.nvl
  else:
  # assume postgres
  return func.coalesce

 I will add this to the docs:


 from sqlalchemy import *
 from sqlalchemy.ext.compiler import compiles

 from sqlalchemy.sql.expression import FunctionElement


 class coalesce(FunctionElement):
     name = 'coalesce'

 @compiles(coalesce)
 def compile(element, compiler, **kw):
     return "coalesce(%s)" % compiler.process(element.clauses)


 @compiles(coalesce, 'oracle')
 def compile(element, compiler, **kw):
     if len(element.clauses) > 2:
         raise TypeError("coalesce only supports two arguments on Oracle")
     return "nvl(%s)" % compiler.process(element.clauses)

 print coalesce(5, 6)

 from sqlalchemy.dialects import oracle
 print coalesce(5, 6).compile(dialect=oracle.dialect())


Might this work as a more complete solution for Oracle?

@compiles(coalesce, 'oracle')
def compile(element, compiler, **kw):
    sql = "nvl(%s)"
    for i in xrange(len(element.clauses) - 2):
        sql %= "%s, nvl(%%s)" % compiler.process(element.clauses[i:i+1])
    return sql % compiler.process(element.clauses[-2:])
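
(With three clauses a, b and c, this should render roughly as nvl(a, nvl(b, c)): each additional argument gets folded into a nested nvl() call.)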

Ian




[sqlalchemy] Joined table inheritance without a discriminator

2010-01-15 Thread Ian
All,

I have two applications: one uses Python with SQLAlchemy and the other
uses Java with Hibernate. There is a slight mismatch in the
joined table inheritance strategy: with Hibernate a discriminator is
not required.

The Sqlalchemy documentation says, in the Joined Table Inheritance
section (http://www.sqlalchemy.org/docs/05/mappers.html#joined-table-
inheritance):

While there are some “tricks” to work around the requirement that
there be a discriminator column, they are more complicated to
configure when one wishes to load polymorphically.

What are these tricks, and where are they documented? I'm not really
all that interested in polymorphic querying in the Python application.

Thanks,

Ian Johnson




Re: [sqlalchemy] Oracle timestamp and postgres datetime compatibility

2009-12-02 Thread Ian Kelly
On Wed, Dec 2, 2009 at 3:25 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 from the article:

  Beware while the TO_CHAR function works with both datatypes, the TRUNC 
 function will not work with a datatype of TIMESTAMP. This is a clear 
 indication that the use of TIMESTAMP datatype should explicitly be used for 
 date and times where a difference in time is of utmost importance, such that 
 Oracle won't even let you compare like values.

 this suggests to me that DATE is more of a general purpose date/time type 
 whereas TIMESTAMP is specifically when you need granularity to compare the 
 ordering of events down to the millisecond, with some loss in simplicity .

This was fixed in version 9.2.0.3.0. Using that release or later,
TRUNC works just fine with TIMESTAMP.  See the thread at
http://forums.oracle.com/forums/thread.jspa?threadID=372457

I should stress that I don't think using DATE is a problem so long as
there's a way to get TIMESTAMP instead.  Especially if the goal is to
support Oracle 8i, where TIMESTAMP doesn't even exist.

Ian





[sqlalchemy] Custom ORM inserts and updates

2009-10-22 Thread Ian

Is there any way to customize the mapper-generated queries used for
persisting objects?  My specific case is that I would like to have the
ORM call a stored procedure for inserts and updates rather than the
usual insert and update statements.

Thanks,
Ian



[sqlalchemy] Connection Pool behaviour change

2008-12-10 Thread Robert Ian Smit

Hi,

I am porting our pylons app from 0.4.5 to 0.5.

It appears that implicit queries are using a new connection instead of
re-using the existing one that is also used by the ORM Session.

I have read the Migration Docs and looked at the changelog and didn't
find anything related to this matter.

To avoid opening many connections or to re-use a transaction handle, I
had to rewrite code such as this

from:

   s = select()
   dbsession.query(Model).instances(s.execute())


to:

  s = select()
  list(dbsession.query(Model).instances(dbsession.execute(s)))

Is this change expected or is it an indication of an error somewhere?

Thanks,

Bob




[sqlalchemy] Removing aggregate function from query results

2008-11-10 Thread Ian Charnas

Hello Alchemy Land!

If I have a simple test-case with Bowler objects and City objects, and
I want to use func.max and group_by in order to find the highest
scorers in each city... I might do something like this:

max_score = func.max(Bowler.highscore).label('highest_score')
results = session.query(Bowler,
max_score).group_by(Bowler.city_id).all()

So this works as I'd expect, and 'results' now contains
(Bowler,max_score) tuples... but  what I really want is to have a
query that just returns Bowler objects, and not these tuples.  Is
there a way to get rid of that 'max_score' column from the result
set?  I've been at this for hours, I bet it's really simple but I just
can't find it.

please help!!!

Full example
--

# STANDARD BOILERPLATE
from sqlalchemy import *
from sqlalchemy.orm import *
engine = create_engine('sqlite:///:memory:', echo=True,
strategy='threadlocal')
Session = scoped_session(sessionmaker(autoflush=False,
autocommit=False))
session = Session(bind=engine)
metadata = ThreadLocalMetaData()
metadata.bind = engine

# DEFINE TABLES
bowlers_table = Table('bowler', metadata,
Column('bowler_id', Integer, primary_key=True),
Column('name', String(50)),
Column('highscore', Integer, default=0),
Column('city_id', None, ForeignKey('city.city_id'))
)
cities_table = Table('city', metadata,
Column('city_id', Integer, primary_key=True),
Column('name', String(50))
)
metadata.create_all()

# DEFINE CLASSES
class Base(object):
def __init__(self, **kwargs):
for key, value in kwargs.iteritems():
setattr(self, key, value)
class Bowler(Base):
pass
class City(Base):
pass

# MAP CLASSES
mapper(City, cities_table)
mapper(Bowler, bowlers_table, properties={
'city': relation(City)
})

# CREATE SAMPLE DATA
nyc = City(name="New York City")
michael_bayer = Bowler(name="Michael Bayer", highscore=299, city=nyc)
big_lebowski = Bowler(name="Jeffrey Lebowsky", highscore=170, city=nyc)

cle = City(name="Cleveland")
ian_charnas = Bowler(name="Ian Charnas", highscore=220, city=cle)
the_jesus = Bowler(name="Antonio DeJesus", highscore=130, city=cle)

session.add_all([michael_bayer, big_lebowski, ian_charnas, the_jesus])
session.flush()

# GET HIGH SCORERS BY CITY
max_score = func.max(Bowler.highscore).label('highest_score')
results = session.query(Bowler,
max_score).group_by(Bowler.city_id).all()

# Results contain (Bowler, max_score) tuples
[(<__main__.Bowler object at 0x139b590>, 170),
 (<__main__.Bowler object at 0x13b20d0>, 130)]

# But I want results to just contain Bowler objects




[sqlalchemy] Re: Removing aggregate function from query results

2008-11-10 Thread Ian Charnas

Simon, Michael, thank you!

Simon, yes you were totally right, my query was totally wrong!  I was
up all night trying to get some code working, and at 5am I was getting
a little fuzzy.  I'd like to use that as my excuse ;-)

What I ended up doing this morning was doing a simple query with max
and group_by that finds the highest score in each city, and then
joined that with Bowler.  The resulting SQL looks great.  Thanks for
your help!

max_score = func.max(Bowler.highscore).label('city_highscore')
city_highscores = session.query(Bowler.city_id,
    max_score).group_by(Bowler.city_id).subquery()

best_bowlers = session.query(Bowler).join(
    (city_highscores,
     and_(
         Bowler.city_id == city_highscores.c.city_id,
         Bowler.highscore == city_highscores.c.city_highscore
     )
    )
)

Hope this helps someone searching on the list!
-Ian

On Nov 10, 10:27 am, Michael Bayer [EMAIL PROTECTED] wrote:
 On Nov 10, 2008, at 5:35 AM, Ian Charnas wrote:





  Hello Alchemy Land!

  If I have a simple test-case with Bowler objects and City objects, and
  I want to use func.max and group_by in order to find the highest
  scorers in each city... I might do something like this:

  max_score = func.max(Bowler.highscore).label('highest_score')
  results = session.query(Bowler,
  max_score).group_by(Bowler.city_id).all()

  So this works as I'd expect, and 'results' now contains
  (Bowler,max_score) tuples... but  what I really want is to have a
  query that just returns Bowler objects, and not these tuples.  Is
  there a way to get rid of that 'max_score' column from the result
  set?  I've been at this for hours, I bet it's really simple but I just
  can't find it.

 hey Ian -

 any chance you can just use the func.max() at the end of the Query  
 using the values() method ?  that way its just an ad-hoc thing.    
 Otherwise there's no official way to remove an entity from an  
 existing Query's list of entities.



[sqlalchemy] What am I missing?

2007-07-19 Thread Ian Wilson

Can anyone explain why this http://paste.turbogears.org/paste/1510
fails at the last assert but this
http://paste.turbogears.org/paste/1511 works ?




[sqlalchemy] Re: sqlite func,datetime

2007-07-06 Thread Ian Charnas


On Jul 6, 7:06 am, Nick [EMAIL PROTECTED] wrote:
 thanks, but that didn't work either. In the end i changed the func.db
 call to a datetime.datetime.now() python call which seems to work. I
 also noticed that i had to change all my date inserts to datetime
 objects as well (previously i was using postgres and inserted dates as
 date strings i.e. '2007-01-01' which seemed to work ok).
 I'm new to SA so perhaps i should have been doing that all along
 anyway.
 Cheers
 Nick

If you're ever planning to load-balance your web app, you probably
want the database to select the time instead of your various
application servers.  The SQL-92 way to do this (in UTC time!) is:

SELECT TIMESTAMP WITH TIME ZONE CURRENT_TIMESTAMP AT TIME ZONE '00:00'

you could just put that inside a text() block like so:

now = text("SELECT TIMESTAMP WITH TIME ZONE CURRENT_TIMESTAMP AT TIME ZONE '00:00'")

and then use default=now in your table definition.

-Ian Charnas





[sqlalchemy] [TICKET 614] [PATCH] - one-liner to fix orig_set on scalar selects

2007-06-20 Thread Ian Charnas

http://www.sqlalchemy.org/trac/ticket/614

basically this line:
self.orig_set = []

needs to be this line:
self.orig_set = util.Set()

or certain selects will throw an error saying list type has no
attribute 'add'

I couldn't find a sqlalchemy-tickets list, so I thought I'd post here.

-Ian Charnas





[sqlalchemy] Trying to detect which class methods were added by the mapper.

2007-06-12 Thread Ian Charnas

Inspired by the SQLAlchemy docs, I'm writing a documentation generator
in python using a combination of epydoc (for parsing/introspection),
genshi (templates), docutils (for restructured text), and pygments
(syntax highlighting).. and I just noticed that the documentation for
classes mapped by SQLAlchemy always includes the methods like
select, count, get_by, etc that were added by the mapper.  This
is very undesirable, and I'm looking for a way to detect which methods
were added to the class by the SQLAlchemy mapper, and which methods
were there to begin with.

Does anyone have any ideas?  I was hoping there would be something
like Animal.select.mapper or Animal.select._sqlalchemy that I
could use to differentiate which methods were added by the mapper and
which were there originally, but I can't seem to find any such thing.

many thanks in advance,
-Ian Charnas from the Pagoda CMS team, www.pagodacms.org
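
One generic way to approach this, sketched below: snapshot the class's attribute names before the mapping step runs and diff afterwards. Nothing here is SQLAlchemy-specific; mapping_callable stands in for whatever call adds the methods (mapper, assign_mapper, and so on).

    def attribute_names_added_by(mapping_callable, cls):
        """Return the set of attribute names that appear on cls only
        after mapping_callable(cls) has been applied."""
        before = set(dir(cls))
        mapping_callable(cls)
        return set(dir(cls)) - before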





[sqlalchemy] Re: Executing SQL directly inside a thread local transaction?

2007-04-30 Thread Ian Charnas

Yes!  While most *any* query can be done using sqlalchemy... and you
*should* try to use sqlalchemy, because then if you change databases
you won't have to go through all your code and figure out if any of
the sql is different... if you're absolutely sure you really really
need to do a bare sql query, you can do this:

from sqlalchemy import *
engine = create_engine("sqlite:///test.db")
engine.execute("insert into people(first_name, last_name) values('ian', 'charnas')")
engine.execute("select * from people").fetchall()
[(1, 'ian', 'charnas')]

note that metadata keeps track of tables, and session keeps track of
instances of mapped classes, so to do this you don't need to use
metadata or session whatsoever.
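
If the raw SQL needs to run inside an explicit transaction rather than through the engine's autocommitting execute(), one sketch (assuming the people table from the snippet above already exists) is to work on a Connection directly:

    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite:///test.db")
    conn = engine.connect()
    trans = conn.begin()
    try:
        conn.execute(text("insert into people(first_name, last_name) "
                          "values ('ian', 'charnas')"))
        trans.commit()
    except:
        trans.rollback()
        raise
    finally:
        conn.close()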

On Apr 30, 10:04 am, Andreas Jung [EMAIL PROTECTED] wrote:
 I have a scenario where we I need to execute SQL directly within a
 transaction that is local to a Python thread. Is there some way to this in
 SA? In this particular usecase there is no UOW involved. Or is it somehow
 to pass a SQL statement somehow to the session and its underlaying
 framework?

 Andreas

  application_pgp-signature_part
 1KDownload





[sqlalchemy] Re: #446: Two deletions cascading to the same object can result in an error

2007-01-30 Thread Ian Kelly

  Finally, note that I'm _not_ asking for sqlalchemy to maintain the
  collections for me.  All I'm asking is for the cascade code not to
  attempt to delete objects that have already been deleted and flushed,
  or at least to safely handle the exception it raises when it does.

 OK, what behavior are you looking for  ?  it raises an exception right
 now.  whats unsafe about it ?

Well, it interrupts the cascading and leaves session.deleted in an
inconsistent state where not all of the object's dependents may be
included.  I was under the assumption that this meant those dependents
wouldn't be deleted, which would be a nightmare for trying to handle
the exception.  But it seems I was mistaken about that -- the flush
finds and deletes the remaining dependents anyway.  I could argue that
leaving session.deleted in an inconsistent state is still a bad thing,
but it's not nearly as severe as I had thought.




[sqlalchemy] Re: #446: Two deletions cascading to the same object can result in an error

2007-01-30 Thread Ian

On Jan 30, 12:51 pm, Michael Bayer [EMAIL PROTECTED] wrote:
 On Jan 30, 2:35 pm, Ian Kelly [EMAIL PROTECTED] wrote:

Finally, note that I'm _not_ asking for sqlalchemy to maintain the
collections for me.  All I'm asking is for the cascade code not to
attempt to delete objects that have already been deleted and flushed,
or at least to safely handle the exception it raises when it does.

   OK, what behavior are you looking for  ?  it raises an exception right
   now.  whats unsafe about it ?

  Well, it interrupts the cascading and leaves session.deleted in an
  inconsistent state where not all of the object's dependents may be
  included.

 if that were true, thats a bug.  can you illustrate this ?  the
 deleted collection on the unit of work is never altered in any way
 until a flush() completes successfully.   in theory, the state of your
 program after an unsuccessful flush should be identical to that of
 before the flush.  this is one advantage to SA's we dont change your
 structures approach - not only is it a huge beast to manage the
 altering of structures, but we'd have to be able to roll the whole
 thing back too.

This happens as a result of a failed session.delete() operation.  The 
flush itself appears to be fine.  If you still want an example, I can 
whip one up.

