[sqlite] Bad performance with large joins?

2006-03-24 Thread Steffen Schwigon
Hi!

I'm trying to use the database of the OpenGeoDB project
(http://opengeodb.hoppe-media.com/index.php?FrontPage_en)
with SQLite.

Simple example queries from opengeodb work ok, like

SELECT text_val
FROM geodb_textdata
WHERE text_type=50010 /* NAME */ AND
  loc_id=27431;

But the less trivial queries are very slow:

SELECT code.text_val as area code, name.text_val as town
FROM geodb_textdata code, geodb_textdata name
WHERE name.loc_id=code.loc_id AND
  code.text_type=50030 /* AREA_CODE */ AND
  name.text_type=50010 /* NAME */
ORDER by 2;

Results just dripple out slowly. Particularly I have to use it
*without* the ORDER BY, else the result takes forever. Postgresql
solves it much faster, with or without ORDER BY.

The only difference in the data is that I used INTEGER instead of
Postgres' Boolean type.

The OpenGeoDB examples are taken from here:

  http://sourceforge.net/docman/display_doc.php?docid=27614group_id=132421

If you want to look at the DB, I have importable data here:

  http://renormalist.net/opengeodb/opengeodb-sqlite.sql.gz
  http://renormalist.net/opengeodb/opengeodb-postgres.sql.gz
  (2 MB each, 26 MB after gunzipping)

The Database is described here:

  http://sourceforge.net/docman/index.php?group_id=132421


Is there something I can optimize in SQLite? E.g., I'm not sure, that
the indexes really work. Syntax seems to match the documentation, but
I'm a sqlite newbie and no db wizard.

(Greeti+Tha)nX
Steffen 
-- 
Steffen Schwigon [EMAIL PROTECTED]
Dresden Perl Mongers http://dresden-pm.org/


Re: [sqlite] Bad performance with large joins?

2006-03-24 Thread Steffen Schwigon
Roger [EMAIL PROTECTED] writes:
 Have you tried creating indexes on your rows.
 [..]
 I suggest you add indexes on text_val

Yes. I use

  create index text_val_idx on geodb_textdata(text_val);

I just experimented with dropping/recreating indexes and it sometimes
even feels a bit faster *without* the index. Anyway, both variants are
slow, nearly same speed.

Another idea: the data are utf-8, can this be a problem?
Do I have to declare this somewhere at import ore runtime?
Can I set it to non-unicode, to see performance difference?

(Greeti+Tha)nX
Steffen 
-- 
Steffen Schwigon [EMAIL PROTECTED]
Dresden Perl Mongers http://dresden-pm.org/


Re: AW: [sqlite] Bad performance with large joins?

2006-03-24 Thread Steffen Schwigon
Christian Schwarz [EMAIL PROTECTED] writes:
  Have you tried creating indexes on your rows.
  [..]
  I suggest you add indexes on text_val

 Yes. I use

   create index text_val_idx on geodb_textdata(text_val);


 This index seems pretty useless. You're querying against
 geodb_textdata.loc_id and geodb_textdata.text_type. So you should create
 an index over these columns.

Sorry, I just named this one in my reply.
In the DB there are much more indexes:

 [...]
 create index text_lid_idx on geodb_textdata(loc_id);
 create index text_val_idx on geodb_textdata(text_val);
 create index text_type_idx on geodb_textdata(text_type);
 create index text_locale_idx on geodb_textdata(text_locale);
 create index text_native_idx on geodb_textdata(is_native_lang);
 create index text_default_idx on geodb_textdata(is_default_name);
 create index text_since_idx on geodb_textdata(valid_since);
 create index text_until_idx on geodb_textdata(valid_until);
 [...]

Practically one such line for each table and each column.
If you want to see the whole db as import script, have a look at

  http://renormalist.net/opengeodb/opengeodb-sqlite.sql.gz

This DB was originally a Postgres one. I just changed the boolean
true/false into integer-0/1 and everything else at least syntactically
worked. Maybe I'm missing some other syntax that SQLite accepts but
silently ignores or handles differently.

GreetinX
Steffen 
-- 
Steffen Schwigon [EMAIL PROTECTED]
Dresden Perl Mongers http://dresden-pm.org/


Re: AW: [sqlite] Bad performance with large joins?

2006-03-24 Thread Steffen Schwigon
Christian Schwarz [EMAIL PROTECTED] writes:
 Practically one such line for each table and each column.

 Why on each column?

I just took the existing DB-import-script from that project. But ...


 For example, when your where-clause contains columns A, B and C (in
 this order) you should create *one* index on A, B and C.

... you are right, creating a combined index solves the speed problem.
Thanks.


 Separate indexes on column A, B and C are not that useful. In this
 case, SQLite would most probably use the separate index on column A.

Which would be a pity, wouldn't it? Postgres for instance seems to do
something more clever there, at least it's much faster, even with the
trivial setting of an index on each column.

Anyway, thanks for your answer.

GreetinX
Steffen 
-- 
Steffen Schwigon [EMAIL PROTECTED]
Dresden Perl Mongers http://dresden-pm.org/