[sqlalchemy] Re: Searching TEXT fields

2006-11-29 Thread Julien Cigar

I suggest that you use an indexer like tsearch2 or lucene / xapian / ...
Indexes cannot be used with the LIKE operator.

José de Paula Eufrásio Júnior wrote:
 What's the better form of doing that? To minize DB usage and stuff...

   


-- 
Julien Cigar
Belgian Biodiversity Platform
http://www.biodiversity.be
Université Libre de Bruxelles
Campus de la Plaine CP 257
Bâtiment NO, Bureau 4 N4 115C (Niveau 4)
Boulevard du Triomphe, entrée ULB 2
B-1050 Bruxelles
office: [EMAIL PROTECTED]
home: [EMAIL PROTECTED]


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Searching TEXT fields

2006-11-29 Thread Lee McFadden

On 11/29/06, José de Paula Eufrásio Júnior [EMAIL PROTECTED] wrote:

 What's the better form of doing that? To minize DB usage and stuff...


That all depends on the back end that you'll be using and how portable
you want your code to be.  IMHO, for searching a database with large
amounts of text, MySQL's full text indexing and searching features are
unparalleled[1].  With a few text queries and the ability to map
arbitrary results to your objects in SA, you can't go wrong.

However, if you're not deploying your app to a specific platform or
you can't use MySQL for any reason, I would go with an indexer as
Julien suggested.

[1]: http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html

Lee

-- 
Lee McFadden

blog: http://www.splee.co.uk
work: http://fireflisystems.com
skype: fireflisystems

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: working with detached objects

2006-11-29 Thread Michael Bayer

if thats true, thats a bug.  can you make me a small test case ?


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Searching TEXT fields

2006-11-29 Thread José de Paula Eufrásio Júnior

On 11/29/06, Lee McFadden [EMAIL PROTECTED] wrote:
 

 That all depends on the back end that you'll be using and how portable
 you want your code to be.  IMHO, for searching a database with large
 amounts of text, MySQL's full text indexing and searching features are
 unparalleled[1].  With a few text queries and the ability to map
 arbitrary results to your objects in SA, you can't go wrong.

 However, if you're not deploying your app to a specific platform or
 you can't use MySQL for any reason, I would go with an indexer as
 Julien suggested.

MySQL is right now the only DB I can use so I'll go for the fulltext.
Sadly, fulltext don't work on InnoDB and my tables where all created
with InnoDB engine. Do you know if there's any problem using mixed
engine types? I can let only the table where the text I want to search
is using MyISAM and the rest using InnoDB...

On other question, is there a parameter on the Table declaration for
SA to create FULLTEXT indexes? I'm trying to find on the docs right
now but maybe you know it already :)

The fulltext search from mysql looks really nice. I'll add a point for
MySQL on my current score, maybe this will redeem it from the utf-8
madness I had some months ago :P

thanks

-- 
José de Paula Eufrásio Júnior
aka coredump
http://core.eti.br

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Cascade performance

2006-11-29 Thread Michael Bayer

i doubt this was any faster in previous releases since the basic
metholodgy of cascade hasnt changed; when you attach object B to object
A, it cascades the save-update operation across the entire graph
represented by B.  While there was one little fix a while back so that
it wouldnt do cascade if B was already in the session, that doesnt
address this situation where you are attaching an unsaved instance
which contains references to a whole graph of saved instances.

so ive added your test with an extra assertion that the session in fact
contains 611 instances to the test/perf directory, and added an extra
argument to the cascade functions called halt_on which indicates to
stop cascading if a condition is met; session sets sends the condition
as c in self so that cascading along save/update/save-update will
cease along a branch if the instance is detected to be in the session
already (i.e. assumes all of its child instances are handled).  thats
rev 2116 and the results are now:

Create forward associations
Time to create item 0: 0.07357 sec
Time to create item 1: 0.10025 sec
Time to create item 2: 0.04157 sec
Time to create item 3: 0.06601 sec
Time to create item 4: 0.04751 sec
Time to create item 5: 0.06988 sec
Time to create item 6: 0.03998 sec
Time to create item 7: 0.07138 sec
Time to create item 8: 0.04332 sec
Time to create item 9: 0.07191 sec
Created 610 objects in 0.62538 sec

Create backward associations
Time to create item 0: 0.03061 sec
Time to create item 1: 0.05590 sec
Time to create item 2: 0.03099 sec
Time to create item 3: 0.07053 sec
Time to create item 4: 0.04608 sec
Time to create item 5: 0.06852 sec
Time to create item 6: 0.03841 sec
Time to create item 7: 0.07422 sec
Time to create item 8: 0.03793 sec
Time to create item 9: 0.06976 sec
Created 610 objects in 0.52296 sec


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Searching TEXT fields

2006-11-29 Thread Lee McFadden

On 11/29/06, José de Paula Eufrásio Júnior [EMAIL PROTECTED] wrote:

 Responding to myself: mixing InnoDB and MyISAM seems impossible. Looks
 like if a key on refers to other table, both table have to use the
 same engine... As I use a lot of many-to-many, I ended with all my
 tables MyISAM :P

Yeah, all your tables will have to be MyISAM.  I've found that it
doesn't actually make any difference as the relationships are handled
by SA anyway.

 And how I create arbitrary queries like that:

 select post_title, post_body from post where match (post_title,
 post_body) against ('nasty midgets');

 on SA?


match_query = post_table.select(MATCH (post_title, post_body) AGAINST (:q))

results = match_query.execute(q=nasty midgets)

# Coerce the results into your post object:
object_list = session.query(Post).instances(results)


Personally I add the query to the object that you want to search then
add a search method that does the last two lines of code.  That way
you can just call:
Post.search(nasty midgets)

Lee


-- 
Lee McFadden

blog: http://www.splee.co.uk
work: http://fireflisystems.com
skype: fireflisystems

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Searching TEXT fields

2006-11-29 Thread José de Paula Eufrásio Júnior

On 11/29/06, Lee McFadden [EMAIL PROTECTED] wrote:
  And how I create arbitrary queries like that:
 
  select post_title, post_body from post where match (post_title,
  post_body) against ('nasty midgets');
 
  on SA?
 

 match_query = post_table.select(MATCH (post_title, post_body) AGAINST (:q))

:q ?

 results = match_query.execute(q=nasty midgets)

 # Coerce the results into your post object:
 object_list = session.query(Post).instances(results)


 Personally I add the query to the object that you want to search then
 add a search method that does the last two lines of code.  That way
 you can just call:
 Post.search(nasty midgets)

Got it. Thanks.

-- 
José de Paula Eufrásio Júnior
aka coredump
http://core.eti.br

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Order by lost in subselect?

2006-11-29 Thread James Taylor

I want to use order by and limit in a sub select, but it doesn't seem  
to work:

 Code 

import pkg_resources
pkg_resources.require( sqlalchemy )
pkg_resources.require( pysqlite )

from sqlalchemy import *

metadata = BoundMetaData( 'sqlite:tmp/test.db' )
metadata.engine.echo = True

class A( object ):
 pass

class B( object ):
 pass

A.table = Table( table_a, metadata,
 Column( id, Integer, primary_key=True),
 Column( name, String(20) ) )

B.table = Table( table_b, metadata,
 Column( id, Integer, primary_key=True),
 Column( name, String(20) ),
 Column( table_a_id, Integer, ForeignKey( table_a.id ) ),
 Column( score, Float ) )

mapper( A, A.table )
mapper( B, B.table )

print Get the 10 highest scoring rows from B:

print select( [B.c.name], from_obj=[B.table], order_by=[ B.c.score ],  
limit=10 )

print Get the 10 highest scoring rows from B, join to A and reorder

print select( [A.c.name, B.c.name, B.c.score],
   B.c.id.in_( select( [B.c.id], from_obj=[B.table],  
order_by=[ B.c.score ], limit=10, correlate=False ) ),
   from_obj=[ A.table.join( B.table ) ],
   order_by = [ A.c.id, B.c.id ] )


 Output 

Get the 10 highest scoring rows from B:
SELECT table_b.name
FROM table_b ORDER BY table_b.score
LIMIT 10 OFFSET 0

Get the 10 highest scoring rows from B, join to A and reorder
SELECT table_a.name, table_b.name, table_b.score
FROM table_a JOIN table_b ON table_a.id = table_b.table_a_id
WHERE table_b.id IN (SELECT table_b.id AS id
FROM table_b
LIMIT 10 OFFSET 0) ORDER BY table_a.id, table_b.id



--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] SQLAlchemy at pycon 07

2006-11-29 Thread Jonathan Ellis

My proposal for a talk on SqlSoup was accepted.  It looks like someone
else's talk on SA itself was accepted too.  Woot! :)

-- 
Jonathan Ellis
http://spyced.blogspot.com

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: SQLAlchemy at pycon 07

2006-11-29 Thread Karl Guertin

On 11/29/06, Jonathan Ellis [EMAIL PROTECTED] wrote:
 My proposal for a talk on SqlSoup was accepted.  It looks like someone
 else's talk on SA itself was accepted too.  Woot! :)

I'm not seeing a list of accepted talks on us.pycon.org; any links?

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: SQLAlchemy at pycon 07

2006-11-29 Thread Jonathan Ellis

Right now you can only see the status of proposals you submitted
yourself, but the final schedule is probably only a couple days away
from being announced.

On 11/29/06, Karl Guertin [EMAIL PROTECTED] wrote:

 On 11/29/06, Jonathan Ellis [EMAIL PROTECTED] wrote:
  My proposal for a talk on SqlSoup was accepted.  It looks like someone
  else's talk on SA itself was accepted too.  Woot! :)

 I'm not seeing a list of accepted talks on us.pycon.org; any links?


-- 
Jonathan Ellis
http://spyced.blogspot.com

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Order by lost in subselect?

2006-11-29 Thread Michael Bayer

in chaneset 2120 i made a change to the general contract of the in_()
function:

- sending a selectable to an IN no longer creates a union out of
multiple
selects; only one selectable to an IN is allowed now (make a union
yourself
if union is needed; explicit better than implicit, dont guess, etc.)

unions automatically blow away the ORDER BY clause since some DBs (i
dont remember which) dont allow ORDER BY in the individual components
of UNION.  so now union is not used here.


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Cascade performance

2006-11-29 Thread Daniel Miller


Michael Bayer wrote:
 i doubt this was any faster in previous releases since the basic
 metholodgy of cascade hasnt changed

Probably wasn't, I've just been testing with larger data sets lately.

 so ive added your test with an extra assertion that the session in fact
 contains 611 instances to the test/perf directory, and added an extra
 argument to the cascade functions called halt_on which indicates to
 stop cascading if a condition is met; session sets sends the condition
 as c in self so that cascading along save/update/save-update will
 cease along a branch if the instance is detected to be in the session
 already (i.e. assumes all of its child instances are handled).  thats
 rev 2116 and the results are now:
 
 Create forward associations
 ...
 Created 610 objects in 0.62538 sec
 
 Create backward associations
 ...
 Created 610 objects in 0.52296 sec

Thanks a million Mike! Works like a charm. It's interesting that it's now 
(slightly) faster to add them the backward way than it is to add them the 
forward way. I double-checked the results and I get the same behavior on my 
machine. Is there more room for optimization maybe?

~ Daniel

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Cascade performance

2006-11-29 Thread Michael Bayer

well things like this, i.e. cascade not going over the same field of
objects over and over again, are big and obvious.  smaller things, its
mostly the attributes package that adds the overhead in...i put that
package through a huge overhaul some versions ago to simplify it, and i
ran it repeatedly through profiling to try to cut it down as much as
possible.  the key is to remove as much on_set/on_get behavior as
possible and move it all to the lets just figure it all out at flush
time stage.

but for it to be much faster at this point would require a major
paradigm shift in how it works.  when you look at it, some of what it
does might appear wasteful but then, theres usually a unit test that
will break if you try to simplify it further.


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---