Re: [sqlalchemy] Performance questions on semi-large result sets

2015-10-22 Thread Gaëtan de Menten
On Wed, Oct 21, 2015 at 11:43 PM, Mike Bayer  wrote:

> If we OTOH use native_unicode=False and use the Unicode type for the
> columns we care about, that seems a lot faster, e.g. changing the
> mapping to:

[...]

> When we use the Unicode type and set use_native_unicode=False,
> SQLAlchemy's own use of Python unicode codecs takes place; this takes
> place within the C extensions and has been highly optimized for speed in
> the way that the codec is invoked.  We've already observed that
> cx_Oracle's native unicode methodologies turn out to be much slower than
> SQLAlchemy's, which was unexpected since cx_Oracle is pure C code, and
> in this case psycopg2 is also pure C code so it is again surprising
> SQLAlchemy's approach is a little faster.

Back then when I created the C extensions, I found that out as well:
virtually all backends were slower at decoding unicode than our own
unicode handling. At that time, there was no easy way (that I knew of,
at least) to turn off the native decoding of the DBAPI via SQLAlchemy.
I am glad to see that it now exists !

Congratulations to Mike for his dedication and endurance in the
project, You never cease to impress.

Gaëtan

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Re: Elixir question

2012-02-05 Thread Gaëtan de Menten

On 02/03/2012 12:08 PM, lars van gemerden wrote:

I should probably make the pair method:

def pair(name1, name2):
 p1, p2 = Pairs(name1), Pairs(name2)
 p1.other = p2
 p2.other = p1

On Feb 3, 11:57 am, lars van gemerdenl...@rational-it.com  wrote:

Hi, I am trying to sote pairs in a table as follows:

#-- 

from elixir import *

metadata.bind = sqlite:///:memory:
metadata.bind.echo = False

class Pairs(Entity):
 name = Field(String(50), primary_key = True)
 other = OneToOne('Pairs', inverse = 'other')
You can't have a OneToOne as inverse for a OneToOne, even less for 
itself. Valid relationship pairs are:


ManyToOne - OneToOne
ManyToOne - OneToMany
ManyToMany - ManyToMany

In your case you want:

class Pairs(Entity):
name = Field(String(50), primary_key = True)
other1 = ManyToOne('Pairs', inverse = 'other2')
other2 = OneToOne('Pairs', inverse = 'other1')

and if your database really only stores pairs, a property might make it 
more elegant:


@property
def other(self):
return self.other1 if self.other1 is not None else self.other2


As a side note, you probably do not want to use Elixir for a new 
project, as Elixir is not maintained anymore.


-G.

--
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Re: Order by in select doesn't seem to work

2012-01-09 Thread Gaëtan de Menten
On Tue, Dec 27, 2011 at 16:42, Michael Bayer mike...@zzzcomputing.com wrote:

 On Dec 27, 2011, at 10:37 AM, VDK wrote:

 Michael,

 I simplified my code just for test purposes. I'm now working with only
 two columns, without order_by clause, commented a few other lines with
 order by. I'm sure there is no order_by left in my code. I run the
 script with echo set to debug. The result:

 INFO:sqlalchemy.engine.base.Engine.0x...a7d0:{}
 DEBUG:sqlalchemy.engine.base.Engine.0x...a7d0:Col
 ('contributie_alle_leden_id', 'contributie_alle_leden_achternaam')
 2011-12-27 16:23:05,839 INFO sqlalchemy.engine.base.Engine.0x...a7d0
 SELECT contributie_alle_leden.id AS contributie_alle_leden_id,
 contributie_alle_leden.achternaam AS
 contributie_alle_leden_achternaam
 FROM (SELECT Leden.id AS id, Leden.achternaam AS achternaam
 FROM Leden) AS contributie_alle_leden ORDER BY
 contributie_alle_leden.id

 The order by is still added. As sqlalchemy doesn't add things, the
 only suspect now is elixir. This is part of the Camelot framework and
 act as a layer upon SQLAlchemy.

 *elixir* - I just searched your other emails and it appears they fail to 
 mention this extremely critical detail.   Elixir adds default order_bys.   
 You need to use elixir's configuration flags to disable that.

Hey! Elixir is clearly not perfect, but that one was uncalled for
because Elixir *does not* add default order_bys (unless you tell it
to, of course). So it is either Camelot's fault or the user code.

In [1]: from elixir import *

In [2]: class A(Entity):
   ...: name = Field(String(20))
   ...:

In [3]: metadata.bind = 'sqlite://'

In [4]: metadata.bind.echo = True

In [5]: setup_all(True)
2012-01-09 09:53:08,694 INFO sqlalchemy.engine.base.Engine PRAGMA table_info(__
main___a)
2012-01-09 09:53:08,696 INFO sqlalchemy.engine.base.Engine ()
2012-01-09 09:53:08,698 INFO sqlalchemy.engine.base.Engine
CREATE TABLE __main___a (
id INTEGER NOT NULL,
name VARCHAR(20),
PRIMARY KEY (id)
)


2012-01-09 09:53:08,700 INFO sqlalchemy.engine.base.Engine ()
2012-01-09 09:53:08,704 INFO sqlalchemy.engine.base.Engine COMMIT

In [6]: A.query.all()
2012-01-09 09:53:15,782 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2012-01-09 09:53:15,783 INFO sqlalchemy.engine.base.Engine SELECT __main___a.id
AS __main___a_id, __main___a.name AS __main___a_name
FROM __main___a
2012-01-09 09:53:15,786 INFO sqlalchemy.engine.base.Engine ()
Out[6]: []

 I think this email thread is going to become the textbook example of why we 
 took out default order by, the next time someone asks.

-- 
Gaëtan de Menten

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] performance vs. psycopg2

2011-12-19 Thread Gaëtan de Menten
On Fri, Dec 16, 2011 at 15:58, Jon Nelson jnel...@jamponi.net wrote:

 Revised to use:

 for row in rows:
  dict(row) # throw away result
  count += 1

I wonder how this could even work... iterating over the row yields
individual values, not tuples?!

I wonder what kind of column types you are using. Could you post your
code for both your tests (with and without SA)?

 SQLAlchemy: 115,000 to 120,000 rows/s (vs. psycopg2 @ 480K - 580K, or
 psycopg2 COPY @ 620K).

 I suspect the issue is that I'm only selecting one column, so the
 per-row overhead is exaggerated.

That is certainly a factor but even then, your numbers seem strange
(at least to me).

-- 
Gaëtan de Menten

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] performance vs. psycopg2

2011-12-16 Thread Gaëtan de Menten
On Thu, Dec 15, 2011 at 19:52, Jon Nelson jnel...@jamponi.net wrote:
 On Thu, Dec 15, 2011 at 12:01 PM, Michael Bayer
 mike...@zzzcomputing.com wrote:

 On Dec 15, 2011, at 12:51 PM, Jon Nelson wrote:

 Up front, I'm not using the ORM at all, and I'm using SQLAlchemy 0.7.4
 with psycopg2 2.4.3 on PostgreSQL 8.4.10 on Linux x86_64.

 I did some performance testing. Selecting 75 million rows (a straight
 up SELECT colA from tableA) from a 5GB table yielded some interesting
 results.
 psycopg2 averaged between 485,000 and 585,000 rows per second.
 Using COPY (via psycopg2) the average was right around 585,000.
 sqlalchemy averaged between 160,000 and 190,000 rows per second.
 That's a pretty big difference.

Weird, IIRC, SA was much closer than raw psycopg2 (without using
COPY), in the range of SA adding a 50% overhead, not a 200% overhead.

 I briefly looked into what the cause could be, but I didn't see
 anything jump out at me (except RowProxy, maybe).
 Thoughts?

 Performance tests like this are fraught with complicating details (such as, 
 did you fully fetch each column in each row in both cases?  Did you have 
 equivalent unicode and numeric conversions in place in both tests ? ).   In 
 this case psycopg2 is written in pure C and SQLAlchemy's result proxy only 
 partially (did you use the C extensions ?).    You'd use the Python 
 profiling module to get a clear picture for what difference there is in 
 effort.   But using any kind of abstraction layer, especially one written in 
 Python, will always add latency versus a pure C program.

 I pretty much did this:
 for row in rows:
  count += 1

That test is probably flawed, as you don't fetch actual values. You
should try to access individual elements (either by iterating over the
row, or indexing it one way or another -- the speed difference can
vary quite a bit depending on that). You might get even worse results
with a proper test though ;-).

-- 
Gaëtan de Menten

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.