I frequently use DISTINCT in subselects to speed up processing
SELECT ... WHERE colname IN ( SELECT DISTINCT colname FROM .....) Try your queries with and without the DISTINCT. Updates benefit even more, often reducing the time drasticly. UPDATE ... WHERE colname IN ( SELECT DISTINCT colname FROM .....) Dennis McGrath --- Jim Limburg <[EMAIL PROTECTED]> wrote: > Thank You Mr R:> > > This is a real good lesson. I see how this would be more effective. > Now if I can get some of our db's more normalized and > setup to utiilize this it will be a great benefit. > > Again. Thank You > Jim Limburg > > A. Razzak Memon wrote: > > > At 01:17 PM 3/3/2004 -0500, Jim Limburg wrote: > > > >> I have a question about this that you wrote: > >> > >> 08. Use correlated sub-selects or multi-table selects instead > >> -- of sub-selects > >> > >> A multi-table select uses indexes to join the tables. A correlated > >> sub-select uses indexes to restrict the list of values to be > >> compared within the sub-select. A plain sub-select will not use > >> indexes. Each row of data from the main query must be compared to > >> every row selected by the sub-select. > >> > >> My question is basically along the lines of not understanding the > >> above comment. Please remember those of us more SQL challenged > >> individuals. When you say plain sub-select, do you mean something > >> like this SELECT * FROM ITEMS WHERE custnum IN (SELECT custnum > from > >> SPECIALORDERS WHERE custnum contains '1234'). > >> Now, does a corrlated sub-select use the T1, T2 type interface? > >> > >> Just wanting to learn. > > > > > > > > Jim, > > > > To illustrate, compare the results obtained by using the different > > techniques on two tables, each with 1,000 unique rows (each row in > > Table1 has only one match in Table2). > > > > (1) A multi-table select, shown below, took 00:05. > > > > SELECT collist FROM table1,table2 + > > WHERE Table1.LinkCol = Table2.LinkCol > > > > (2) A correlated sub-select, which looked at every row in Table1 > > but needed to look at only one row in Table2 for each row in > > Table1, takes less than 00:10. The example is shown below: > > > > SELECT collist FROM Table1 WHERE LinkCol IN + > > (SELECT LinkCol FROM Table2 WHERE > Table2.LinkCol=Table1.LinkCol) > > > > (3) A plain sub-select, which had to look for a match in every row > > in Table2 for each of the 1,000 rows of Table1, will take more > > than 00:10. (three times, maybe..) to complete. This example, > > by far the slowest, is shown below: > > > > SELECT collist FROM Table1 WHERE LinkCol IN + > > (SELECT LinkCol FROM Table2) > > > > SELECT is a command in which your data really affects the > performance. > > The results you see and the method you find best will depend on the > > number of rows and distribution of data in the tables you are > combining. > > > > You should try those techniques in your real time database and see > what > > you find. > > > > Hope that helps! > > > > Very Best R:egards, > > > > Razzak. > > > > > > A. Razzak Memon wrote: > > > >>> At 03:48 PM 3/2/2004 -0800, Victor Timmons wrote: > >>> > >>>> I would like some tips on how to speedup rbase when it > >>>> running on a network. > >>> > >>> > >>> Victor, > >>> > >>> Here are a few more tips in addition to the suggestions > >>> posted earlier by fellow R:BASE users/developers: > >>> > >>> -- > >>> 01. Use the most current version of R:BASE > >>> -- (I know you are using the latest & greatest version!) > >>> > >>> R:BASE Technologies is continually making speed improvements, > >>> and each new release is faster than the previous one. For > >>> example, building indexes in R:BASE 6.5/7.x is significantly > >>> faster than in previous versions. > >>> > >>> You can even PACK the INDEXes faster in 6.5++, even in a > >>> Multi-User environment. > >>> > >>> PACK INDEX > >>> or > >>> PACK INDEX FOR TableName > >>> > >>> -- > >>> 02. Think globally instead of row by row > >>> -- > >>> This is one of the best ways to achieve some significant speed > >>> improvements. Instead of doing row-by-row processing using a > >>> DECLARE CURSOR, try using a single UPDATE or INSERT command, > >>> which will take advantage of the SQL select capabilities of > >>> R:BASE. > >>> > >>> For example, suppose you want to create a table to hold > >>> information about how many of each product a customer has > >>> purchased. The most logical approach is to set up a cursor > >>> to go through the detail table and count the rows for each > >>> customer/product combination and then insert that data into > >>> the new table. On a table with 3500 rows, processing all > >>> the rows may take few minutes. By replacing the DECLARE > >>> CURSOR loop with a single insert command, the time required > >>> to process all the rows may take 75% less. > >>> > >>> Compare the two pieces of code: > >>> > >>> DECLARE c1 CURSOR FOR SELECT CustID,PartNo FROM CPProdDet + > >>> GROUP BY CustID,PartNo > >>> OPEN c1 > >>> FETCH c1 INTO + > >>> vCustID INDIC ivCustID, + > >>> vPartNo INDIC ivPartNo > >>> WHILE SQLCODE < > 100 THEN > >>> SELECT COUNT(*) INTO vCount FROM CPProdDet WHERE + > >>> CustID = .vCustID AND PartNo =.vPartNo > >>> INSERT INTO CustProd (CustID,PartNo,Copies) VALUES + > >>> (.vCustID,.vPartNo,.vCount) > >>> FETCH c1 INTO + > >>> vCustID INDIC ivCustID, + > >>> vPartNo INDIC ivPartNo > >>> ENDWHILE > >>> DROP CURSOR c1 > >>> > >>> vs. > >>> > >>> INSERT INTO CustProd (CustID,PartNo,Copies) + > >>> SELECT CustID,PartNo,Count(*) + > >>> FROM CPProdDet GROUP BY CustID,PartNo > >>> > >>> -- > >>> 03. Be Creative > >>> -- > >>> R:BASE almost always offers two or more ways of doing something. > >>> Look for an alternative method for accomplishing a task - it > >>> might just be considerably faster. > >>> > >>> -- > >>> 04. Change the environment > >>> -- > >>> The R:BASE environment settings that have the most effect on > >>> performance are SET MESSAGES. > >>> > >>> SET MESSAGES OFF eliminates the time required for screen display > >>> of messages. This is particularly noticeable if you are doing > >>> repetitive UPDATE or INSERT commands. > >>> > >>> -- > >>> 05. Reduce Number of Expressions in Reports > >>> -- > >>> Because all report expressions are processed for each row of > >>> data read from the reports driving table or view, reducing the > >>> number of expressions in a report increases the printing speed > >>> of a report. > >>> > >>> -- > >>> 06. Abbreviate commands to four characters > >>> -- > >>> R:BASE parses command lines into 4-byte tokens - the fewer > >>> tokens to read, the faster the command executes. > >>> > >>> -- > >>> 07. Combine commands whenever possible > >>> -- > >>> Many R:BASE commands allow access to more than one column or > >>> variable and don't require using separate commands for each. > >>> This reduces the number of times R:BASE must parse and read a > >>> command. The fewer commands to read, the faster R:BASE executes. > >>> > >>> -- > >>> 08. Use correlated sub-selects or multi-table selects instead > >>> -- of sub-selects > >>> > >>> A multi-table select uses indexes to join the tables. A > correlated > >>> sub-select uses indexes to restrict the list of values to be > >>> compared within the sub-select. A plain sub-select will not use > >>> indexes. Each row of data from the main query must be compared to > === message truncated ===

