Re: [sqlite] Slow JOIN on two indexed text fields. Why?????

2009-12-01 Thread sorka

Thank you! That was it. I've been pulling my hair out over this all day.
I should have seen it. I've never used STRING in my own tables and I
inherited this from someone else and didn't even think twice that the type
difference would be the issue.

Thanks you again.


sorka wrote:
> 
> This is driving me nuts. I have two tables I'm trying to join together on
> two text fields.
> 
> CREATE TABLE tmp_role (  programId   INTEGER,
>   roleNameINTEGER,
>   positionINTEGER,
>   isNew   BOOL,
>   personIdINTEGER,
>   nameSTRING);
> 
> This table has up to a few dozen records at any one time.
> 
> AND
> 
> CREATE TABLE person (
> personId INTEGER PRIMARY KEY,
> name text UNIQUE
> );
> 
> This table has 10s of thousands of records.
> 
> If I do this query:
> SELECT person.ROWID FROM tmp_role JOIN person ON tmp_role.name =
> person.name;
> 
> to find the ROWID of each row in person who's name matches that of the
> name in tmp_role, it takes about 1 second per matcha really long time.
> 
> However, if I instead take each of names in tmp_role and do a seperate
> select like this:
> SELECT ROWID FROM person WHERE name = "Carell|Steve"; 
> 
> and do it for each name, the search takes only a few ms for few dozen
> records in tmp_role.
> 
> Now the real problem I'm trying to solve is an UPDATE like this:
> 
> UPDATE tmp_role SET personId = (SELECT ROWID FROM person WHERE
> tmp_role.name = person.name);
> 
> If I break this up into a bunch of different statements to iterate through
> the records in tmp_role and then execute a single statement for each name,
> I can accomplish this update statement fairly quickly, but as it is
> authored above, it's taking about 24 seconds for 24 records or about 1000
> times longer than if I do it the long way :(
> 
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Slow-JOIN-on-two-indexed-text-fields.-Why--tp26601433p26602612.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Slow JOIN on two indexed text fields. Why?????

2009-12-01 Thread Igor Tandetnik
Igor Tandetnik  wrote:
> sorka  wrote:
>> This is driving me nuts. I have two tables I'm trying to join
>> together on two text fields.
>> 
>> CREATE TABLE tmp_role (  programId   INTEGER,
>>  roleNameINTEGER,
>>  positionINTEGER,
>>  isNew   BOOL,
>>  personIdINTEGER,
>>  nameSTRING);
>> 
>> This table has up to a few dozen records at any one time.
>> 
>> AND
>> 
>> CREATE TABLE person (
>>personId INTEGER PRIMARY KEY,
>>name text UNIQUE
>> );
>> 
>> This table has 10s of thousands of records.
>> 
>> If I do this query:
>> SELECT person.ROWID FROM ApgDb.tmp_role JOIN PgDb.person ON
>> tmp_role.name = person.name;
>> 
>> to find the ROWID of each row in person who's name matches that of
>> the name in tmp_role, it takes about 1 second per matcha really
>> long time.
> 
> The problem seems to be with the affinity of tmp_role.name column.
> STRING doesn't have any special meaning to SQLite, and so the column
> ends up with NUMERIC affinity. On the other hand, TEXT gives the
> column TEXT affinity. For some reason I don't quite understand, this
> prevents SQLite from using the index on person(name). Try this, it
> should run much faster: 
> 
> SELECT person.ROWID FROM tmp_role JOIN person ON cast(tmp_role.name
> as text) = person.name; 
> 
> For more details about data types, column affinity and such, see
> http://sqlite.org/datatype3.html 

Ah, now I understand why the index is not used. As described in section 3 
"Comparison Expressions" in the aforementioned document, when comparing NUMERIC 
and TEXT columns, an attempt is made to convert the text to a number, so that 
12 would be considered equal to '12' and '012' and '12.0'. But when the index 
on person(name) was built, it interpreted these three values as strings and 
considered them distinct. That's why the index can't be used.

Bottom line is, don't use STRING as column type, use TEXT or CHAR or similar.

Igor Tandetnik


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Slow JOIN on two indexed text fields. Why?????

2009-12-01 Thread Igor Tandetnik
sorka  wrote:
> This is driving me nuts. I have two tables I'm trying to join
> together on two text fields.
> 
> CREATE TABLE tmp_role (  programId   INTEGER,
>  roleNameINTEGER,
>  positionINTEGER,
>  isNew   BOOL,
>  personIdINTEGER,
>  nameSTRING);
> 
> This table has up to a few dozen records at any one time.
> 
> AND
> 
> CREATE TABLE person (
>personId INTEGER PRIMARY KEY,
>name text UNIQUE
> );
> 
> This table has 10s of thousands of records.
> 
> If I do this query:
> SELECT person.ROWID FROM ApgDb.tmp_role JOIN PgDb.person ON
> tmp_role.name = person.name;
> 
> to find the ROWID of each row in person who's name matches that of
> the name in tmp_role, it takes about 1 second per matcha really
> long time. 

The problem seems to be with the affinity of tmp_role.name column. STRING 
doesn't have any special meaning to SQLite, and so the column ends up with 
NUMERIC affinity. On the other hand, TEXT gives the column TEXT affinity. For 
some reason I don't quite understand, this prevents SQLite from using the index 
on person(name). Try this, it should run much faster:

SELECT person.ROWID FROM tmp_role JOIN person ON cast(tmp_role.name as text) = 
person.name;

For more details about data types, column affinity and such, see 
http://sqlite.org/datatype3.html

Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Slow JOIN on two indexed text fields. Why?????

2009-12-01 Thread sorka

This is driving me nuts. I have two tables I'm trying to join together on two
text fields.

CREATE TABLE tmp_role (  programId   INTEGER,
  roleNameINTEGER,
  positionINTEGER,
  isNew   BOOL,
  personIdINTEGER,
  nameSTRING);

This table has up to a few dozen records at any one time.

AND

CREATE TABLE person (
personId INTEGER PRIMARY KEY,
name text UNIQUE
);

This table has 10s of thousands of records.

If I do this query:
SELECT person.ROWID FROM ApgDb.tmp_role JOIN PgDb.person ON tmp_role.name =
person.name;

to find the ROWID of each row in person who's name matches that of the name
in tmp_role, it takes about 1 second per matcha really long time.

However, if I instead take each of names in tmp_role and do a seperate
select like this:
SELECT ROWID FROM person WHERE name = "Carell|Steve"; 

and do it for each name, the search takes only a few ms for few dozen
records in tmp_role.

Now the real problem I'm trying to solve is an UPDATE like this:

UPDATE tmp_role SET personId = (SELECT ROWID FROM person WHERE tmp_role.name
= person.name);

If I break this up into a bunch of different statements to iterate through
the records in tmp_role and then execute a single statement for each name, I
can accomplish this update statement fairly quickly, but as it is authored
above, it's taking about 24 seconds for 24 records or about 1000 times
longer than if I do it the long way :(





-- 
View this message in context: 
http://old.nabble.com/Slow-JOIN-on-two-indexed-text-fields.-Why--tp26601433p26601433.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users