On Jul 19, 2010, at 1:53 PM, Adrian Price-Whelan wrote:

> Here is some more detailed information trying the query multiple ways:
> 
> Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
> file is 241MB):
> 
> time echo "SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
> spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
> spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
> spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
> spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
> spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
> spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
> 'spPlate-3586-55181.fits';" | psql spectradb -U postgres > tmp
> 
> real  0m12.052s
> user  0m2.501s
> sys   0m0.274s
> 
> engine.execute on the same query took ~6 seconds:
> spectra = engine.execute("SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
> spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
> spectrum_values, spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
> spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
> spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
> spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
> spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
> 'spPlate-3586-55181.fits';")

Call fetchall() on the result to get a better idea what's going on.    Here's 
the source of the ARRAY type:

    def result_processor(self, dialect, coltype):
        item_proc = self.item_type.result_processor(dialect, coltype)
        if item_proc:
            def convert_item(item):
                if isinstance(item, list):
                    return [convert_item(child) for child in item]
                else:
                    return item_proc(item)
        else:
            def convert_item(item):
                if isinstance(item, list):
                    return [convert_item(child) for child in item]
                else:
                    return item
        def process(value):
            if value is None:
                return value
            return [convert_item(item) for item in value]
        return process

as you can see, if your ARRAY is of a Unicode type or similar, convert() must 
be called on each item (only happens during fetch).  This is a likely cause of 
the slowdown and you should consider what kind of converters you're using on 
your ARRAY members.


-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to