Re: AW: Query optimization problem

Euke Castellano Thu, 03 Feb 2005 07:00:24 -0800

Becker, Holger wrote:

Euke Castellano wrote:
I have the following table definition:
CREATE TABLE "CONSUM"(
  "ID" Integer NOT NULL,
  "HOTEL" Integer NOT NULL,
  "SERVICE" Integer NOT NULL,
  "SEGMENT" Integer NOT NULL,
  "GUEST" Integer,
  "COMPANY" Fixed (38,0),
  "AGENCY" Fixed (38,0),
  "REPRESENTATIVE" Fixed (38,0),
  "INVOICEDATE" Date,
  "CHARGEDATE" Date,
  "QUANTITY" Fixed (12,2) DEFAULT 0.00,
  "AMOUNT" Fixed (12,2) DEFAULT 0.00,
  "ROOMNIGHTS" Fixed (12,2) DEFAULT 0.00,
  "RATE" Integer,
  "SEASON" Char (1) ASCII,
  "PROCESSDATE" Date,
  PRIMARY KEY ("ID")
)
I also have a UNIQUE INDEX created on the column INVOICEDATE.
The table has ~25,000,000 rows.
When I run a query like: SELECT * FROM consum WHERE invoiceDate between '2004-01-01' AND '2004-12-31' it takes only a few seconds to show the information in SQL Studio, but if I run this other query: SELECT * FROM consum WHERE invoiceDate between '2004-01-01' AND '2004-12-31' AND rate=9 it takes several hours to process the information.

What can I do to optimize this kind of query? Should I create an index for each column that I need to query? Or is it better to create an index that includes all the columns of the query?
Thank you very much for your help and sorry for my English.
The index is not UNIQUE. Sorry and thanks.
Hi,
did you see the whole result in SQL Studio for the first select, or is it possible that you only looked at the first n rows?
SQL Studio only fetches the rows that the user requests while scrolling through the result set.
So I suppose that you get the first rows very fast because many or all rows are in the range you asked for, and that it would take much longer to scroll through the whole result.

In your second query you look for rows that have rate=9, and if only a few rows fulfil this condition it takes much longer to find the first n rows.
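
The effect described above can be illustrated with a small, self-contained Java sketch. It does not talk to MaxDB at all; it only models a scan over the rows of the date range and counts how many rows have to be examined before the first n matching rows can be handed to the client. The class name, the helper rowsExaminedForFirstN, and all row counts are assumptions made up for illustration.

```java
import java.util.function.IntPredicate;

class FirstRowsSketch {

    /**
     * Counts how many rows a scan has to examine before it can hand back
     * the first n rows (n >= 1) that satisfy the filter. Rows are modelled
     * as the integers 0..totalRows-1 in scan order.
     */
    static int rowsExaminedForFirstN(int totalRows, int n, IntPredicate matches) {
        int found = 0;
        for (int row = 0; row < totalRows; row++) {
            if (matches.test(row)) {
                found++;
                if (found == n) {
                    return row + 1; // rows examined so far
                }
            }
        }
        return totalRows; // fewer than n matches: the whole range was scanned
    }

    public static void main(String[] args) {
        int rowsInDateRange = 1_000_000; // assumed size, for illustration only
        int firstScreen = 100;           // assumed number of rows a GUI shows first

        // First query: every row in the date range qualifies.
        int q1 = rowsExaminedForFirstN(rowsInDateRange, firstScreen, row -> true);

        // Second query: assume only every 500th row also has the requested rate.
        int q2 = rowsExaminedForFirstN(rowsInDateRange, firstScreen, row -> row % 500 == 0);

        System.out.println("date range only     : " + q1 + " rows examined");
        System.out.println("date range and rate : " + q2 + " rows examined");
    }
}
```

Under these assumed numbers the second scan has to touch several hundred times more rows before the first screen is full, even though both queries cover the same date range.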

You could speed up your query with an index over multiple columns, invoiceDate and rate: "create index i2 on consum (invoiceDate,rate)"
Kind regards
Holger
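
As a rough illustration of why the suggested index can help, here is a minimal Java sketch of a generic index model; it is not a description of how MaxDB actually executes the query. It assumes that an index on invoiceDate alone forces one table lookup per index entry in the date range (the rate has to be read from the row), while an index on (invoiceDate, rate) lets the rate be checked inside the index so that only matching entries need a table lookup. The class name, method names, and data distribution are invented for the example.

```java
class CompositeIndexSketch {

    static final int DAYS = 365;          // assumed: one year of data
    static final int RATES_PER_DAY = 50;  // assumed: rates 1..50, one row each per day

    /** Index on invoiceDate only: every entry in the range needs a table lookup. */
    static long lookupsWithDateIndex(int fromDay, int toDay) {
        long lookups = 0;
        for (int day = 1; day <= DAYS; day++) {
            for (int rate = 1; rate <= RATES_PER_DAY; rate++) {
                if (day >= fromDay && day <= toDay) {
                    lookups++; // row must be read to learn its rate
                }
            }
        }
        return lookups;
    }

    /** Index on (invoiceDate, rate): the rate is filtered inside the index. */
    static long lookupsWithCompositeIndex(int fromDay, int toDay, int wantedRate) {
        long lookups = 0;
        for (int day = 1; day <= DAYS; day++) {
            for (int rate = 1; rate <= RATES_PER_DAY; rate++) {
                if (day >= fromDay && day <= toDay && rate == wantedRate) {
                    lookups++; // only matching entries reach the table
                }
            }
        }
        return lookups;
    }

    public static void main(String[] args) {
        // Days 1..31 stand in for January; rate 23 is the value from the test query.
        System.out.println("date index      : " + lookupsWithDateIndex(1, 31) + " table lookups");
        System.out.println("composite index : " + lookupsWithCompositeIndex(1, 31, 23) + " table lookups");
    }
}
```

In this model both variants still walk every index entry in the date range; the saving comes only from the avoided table lookups.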

Thanks for your answer, Holger.

You're right in your assessment. I understand your explanation perfectly, but I consider the performance of this statement not good enough for my application. I wrote a simple program using JDBC to test this statement:

********* BEGIN JAVA
{.......}
public class TestConsum {

{.......}
    public static void main(String[] args) {.......}{
        Connection cn = DriverManager.getConnection(url, user, passwd);
        String sel = "SELECT id,hotel,service,segment,rate,invoiceDate FROM consum "
                   + " WHERE invoiceDate BETWEEN '2004-01-01' AND '2004-01-31' AND rate=23";
        PreparedStatement stmt = cn.prepareStatement(sel, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchDirection(ResultSet.FETCH_FORWARD);
        System.out.println( "BEGIN ..: " + new Date());
        ResultSet rs = stmt.executeQuery();
        for (int rowCounter=0;rs.next();rowCounter++){
            if (rowCounter % 5000 == 0) {
                System.out.println( rowCounter + " --> " + new Date());
            }
        }
        System.out.println( "END ..: " + new Date());
        rs.close();
        stmt.close();
        cn.close();
    }
}
********* END JAVA

As you can see, no extra operations are performed. The output:

********** BEGIN TRACE
BEGIN  ..: Wed Feb 02 18:18:24 CET 2005
0      --> Wed Feb 02 18:19:01 CET 2005
5000   --> Wed Feb 02 18:21:43 CET 2005
10000  --> Wed Feb 02 18:24:02 CET 2005
15000  --> Wed Feb 02 18:26:06 CET 2005
20000  --> Wed Feb 02 18:27:54 CET 2005
25000  --> Wed Feb 02 18:28:51 CET 2005
30000  --> Wed Feb 02 18:30:36 CET 2005
35000  --> Wed Feb 02 18:32:57 CET 2005
40000  --> Wed Feb 02 18:34:38 CET 2005
45000  --> Wed Feb 02 18:36:00 CET 2005
50000  --> Wed Feb 02 18:37:37 CET 2005
55000  --> Wed Feb 02 18:38:45 CET 2005
60000  --> Wed Feb 02 18:39:51 CET 2005
65000  --> Wed Feb 02 18:40:51 CET 2005
70000  --> Wed Feb 02 18:42:04 CET 2005
75000  --> Wed Feb 02 18:44:20 CET 2005
80000  --> Wed Feb 02 18:46:43 CET 2005
85000  --> Wed Feb 02 18:48:44 CET 2005
90000  --> Wed Feb 02 18:50:04 CET 2005
95000  --> Wed Feb 02 18:51:23 CET 2005
100000 --> Wed Feb 02 18:52:40 CET 2005
105000 --> Wed Feb 02 18:53:34 CET 2005
110000 --> Wed Feb 02 18:54:33 CET 2005
115000 --> Wed Feb 02 18:55:19 CET 2005
120000 --> Wed Feb 02 18:56:45 CET 2005
125000 --> Wed Feb 02 18:58:59 CET 2005
130000 --> Wed Feb 02 19:01:13 CET 2005
135000 --> Wed Feb 02 19:02:56 CET 2005
140000 --> Wed Feb 02 19:04:13 CET 2005
145000 --> Wed Feb 02 19:05:39 CET 2005
150000 --> Wed Feb 02 19:06:43 CET 2005
155000 --> Wed Feb 02 19:07:55 CET 2005
160000 --> Wed Feb 02 19:09:16 CET 2005
165000 --> Wed Feb 02 19:10:18 CET 2005
170000 --> Wed Feb 02 19:11:37 CET 2005
END    ..: Wed Feb 02 19:12:26 CET 2005
********** END TRACE
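
For reference, the timestamps in the trace above can be turned into an elapsed time and an approximate fetch rate. The following is a small Java sketch of that arithmetic; the class and method names are made up, and the row count of 170,000 is taken from the last progress line, so the real total is slightly higher.

```java
import java.time.Duration;
import java.time.LocalTime;

class TraceThroughput {

    /** Seconds between two wall-clock times on the same day. */
    static long elapsedSeconds(LocalTime begin, LocalTime end) {
        return Duration.between(begin, end).getSeconds();
    }

    /** Average number of rows fetched per second. */
    static double rowsPerSecond(long rows, long seconds) {
        return (double) rows / seconds;
    }

    public static void main(String[] args) {
        LocalTime begin    = LocalTime.of(18, 18, 24); // BEGIN line of the trace
        LocalTime firstRow = LocalTime.of(18, 19, 1);  // line "0 -->" of the trace
        LocalTime end      = LocalTime.of(19, 12, 26); // END line of the trace
        long rows = 170_000;                           // last progress line (lower bound)

        long total = elapsedSeconds(begin, end);
        System.out.println("time to first row : " + elapsedSeconds(begin, firstRow) + " s");
        System.out.println("total elapsed     : " + total / 60 + " min " + total % 60 + " s");
        System.out.printf("average rate      : %.1f rows/s%n", rowsPerSecond(rows, total));
    }
}
```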

Do you think it is normal that the process takes almost an hour to scan ~170,000 rows?

Sorry for my English; if something is not clear, please ask me.
Thank you very much.

Euke.


--
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
