Oops - I too forgot the list! ----- Forwarded by Shawn Green/Unimin on 01/31/2006 09:19 AM -----
Shawn Green/Unimin 01/31/2006 09:06 AM To Jacques Brignon <[EMAIL PROTECTED]> cc Subject Re: Trans.: Re: Finding the row number satisfying a conditon in a result set Thank you for your response! :-) How to implement option 1 depends on your client-side environment. If you have an application that runs completely client-side then your results are already client-side when you ask for them and you don't have to worry about copying the data to the client in an array. If you are developing a web site then things change a bit. It is possible using DHTML(XHTML, or whatever they are calling it this week) to send all of the data to the client in the form of the HTML to create an array (usually a javascript array) within the browser page used to view the data. Then client-side scripting is used to scroll through the results (by creating and recreating a <TABLE>) and show the user just the "pages" you want them to see. A variant on this is to have a data browser page surrounding a data retrieval page (inside an IFRAME) and you manipulate the inner page from the outer page by controlling the scrolling in code (a variant of this theme would be to have the data frame hidden and you use client-side script to pick just portions of it for display.) Another way to speed this up would be to cache the results server-side in a session-level variable or in a static table that is uniquely identified within the session. Then as the user browses through the data, you don't need to run the original query multiple times to get to the particular subset of records you want to show. You can take it straight from your cache on the web server. A fourth option could be to use a client-server protocol like SOAP to actually query the database from the client interactively. However, this would still cause the database to execute your main query every time you wanted just a page of data. You already identified the need to minimize trips into the database. You just need to workout the best way for your application's design how to do that. Odds are, it's going to involve the temporary storage of your main query somewhere (a cache of the results). It may also require the building of an index array or two. Look up the "quicksort" and "binary search" algorithms if you take this route. They are very efficient and I have used them before on large sets of data with good performance results. I am sorry I can't be more specific but there are many approaches to this technique and I am not sure which one will work best for your situation. Let me know if I can help in any way. Yours, Shawn Green Database Administrator Unimin Corporation - Spruce Pine Jacques Brignon <[EMAIL PROTECTED]> wrote on 01/31/2006 07:52:16 AM: > Thanks Shawn for the detailed answer, > > What I am currently doing is basically what you propose, I do a full query to > retreive the row numbers of the subset I want to display and of the "selected" > record if any in that subset, then I use another query with LIMIT toget those > rows for display. > > What I am trying to do is to improve the performance by limiting thenumber of > queries and by identifying the most efficient way of finding the rownumber of > the search record. I am currently using brute force by loopiong through the > result set until I find the record. The proposal of storing the set in a temp > table should improve that, allowing to retrieve the row by a query on that > table which we can expect to be faster. > > So As you correctly describe, what I need is to allow the user to > scroll through > the set, and as you correctly describe, I am therefore usiong your option 2 > doing one query to locate the rows and one with limit to get those to be > displayed and of course I am hitting performance issues. I also noticed that > all the queries using limit do not run at the same speed, the more you get > close to the end of the data set the more it takes time. > > I uderstand the approach number 3 using a temp table, I am also intersted in > your approach number 1 but I am not sure to understand what you mean and how > you do that using the PHP MySql function libray. > Do you mean passing all the rows of the result at once to the client > application > and storing them in memory (an array)? If the result set is big, > couldn't we hit > some limits or experience other performance issues? I see how to getin PHP the > values of one row of the result set, how do you get all the rows at once other > than looping through the result set and getting one row after the other? > > -- > Jacques Brignon > > Selon [EMAIL PROTECTED]: > > > Jacques Brignon <[EMAIL PROTECTED]> wrote on 01/30/2006 10:18:59 AM: > > > > > Oops! forgoten to include the list in the relply > > > > > > -- > > > Jacques Brignon > > > > > > ----- Message transféré de Jacques Brignon <[EMAIL PROTECTED]> ----- > > > Date : Mon, 30 Jan 2006 16:16:53 +0100 > > > De : Jacques Brignon <[EMAIL PROTECTED]> > > > Adresse de retour :Jacques Brignon <[EMAIL PROTECTED]> > > > Sujet : Re: Finding the row number satisfying a conditon in a result > > set > > > À : Jake Peavy <[EMAIL PROTECTED]> > > > > > > Selon Jake Peavy <[EMAIL PROTECTED]>: > > > > > > > On 1/30/06, Jacques Brignon <[EMAIL PROTECTED]> wrote: > > > > > > > > > > I would like some advice on the various and best ways of finding the > > rank > > > > > of the > > > > > row which satisfies a given condition in a rsult set. > > > > > > > > > > Let's assume that the result set includes a field containing an > > identifier > > > > > from > > > > > one of the table used in the query and that not two rows have the > > same > > > > > value > > > > > for this identifier but that the result set does not contains all > > the > > > > > sequential values for this identifier and/or the values are not > > sorted in > > > > > any > > > > > predictable order. > > > > > > > > > > The brute force method is to loop through all the rows of the result > > set, > > > > > until > > > > > the number is found to get the rank of the row. That does not seem > > very > > > > > clever > > > > > and it can be very time consuming if the set has a lot of rows. > > > > > > > > > > > > > > > > use ORDER BY with a LIMIT of 1 > > > > > > > > your subject line needs work though - a "row number" has no meaning in > > a > > > > relational database. > > > > > > > > -jp > > > > > > > > > > Thanks for the tip, I am going to think to it as I do not see right away > > how > > > this solves the problem. > > > > > > I agree with your comment, This is precisely because the result row > > number is > > > not in the database that I need to find it. > > > > > > The problem I am trying to solve is the following: > > > > > > A query returns a result set with a number of rows, lets say 15000 as an > > > example. > > > > > > I have an application wich displays those 10 by 10 with arrows > > > based navigation > > > capabilities (first page, previous page, next page, last page). > > > > > > I also have a search capability and I need to find in which set of 10 > > results > > > the row I search for will be diplayed in order to show directly the > > > appropriate > > > page and to know what is the rank of this row in the result set or in > > the page > > > to show the searched result row "selected". > > > > > > As an example the row having a customer id of 125, would have the row # > > 563 in > > > the result set (not orderd by customer id but by some other criterion > > like > > > name) and would therefore be displayed in the page showing result rows > > 561 to > > > 570 > > > > > > When I say row I do not mean a row in any table but a row in the result > > set > > > produced by the query which can touch several tables. > > > > > > None of the fields of the result set contains the row number, it is just > > the > > > number of time I have to loop through the result set to get the row in > > the set > > > which matches my criterion. > > > > > > I hope this makes my question clearer. > > > > > > I am sure this is a pretty common problem, but I have not yet figured > > out the > > > clever way to tackle it! > > > > > > -- > > > Jacques Brignon > > > ----- Fin du message transféré ----- > > > > Yes, that is much clearer. Assuming that your results ARE ordered by some > > criteria (such as by name) so that the sequence of one query execution > > closely resembles that of another then you can artificially create a > > sequence number by saving those results into a temporary table with an > > auto_increment column. > > > > CREATE TEMPORARY TABLE tmpResults ( > > rownum int unsigned auto_increment > > , name varchar(50) not null, > > , ... other columns in your results ... > > , primary_key (rownum) > > , key(name) > > ); > > > > INSERT tmpResults (name,... other columns ...) > > SELECT name, ... other columns ... > > ... (rest of query) ...; > > > > Now you have somewhere that has a row number on each row of your query. In > > most applications, it is more efficient to either send the whole recordset > > to the client and display the results in pages based on the cached results > > or to run a smaller query of just those fields you want to search by and > > send them to the client as a form of index. Then the client can ask the > > server for the FULL query and use the LIMIT offsets you had from the > > "index" query. > > > > What it boils down to is this: > > > > a) What you have described may look user friendly but it is database > > intensive. Your application performance will probably suffer. > > b) Most queries only ask for what they actually need. If you only wanted > > results that match a certain name or other condition, only ask for those > > rows. That may mean modifying your client so that it only asks for the > > rows the user wants to see. > > c) If you want your user to "scroll through" a set of results you have > > three simple options: > > 1) pull them all on the first query and navigate the results client-side > > (very fast, quite scalable, most flexible) > > 2) re-execute the query multiple times on the server and work with each > > separate result set. This can become highly intensive if what you actually > > wanted to show is in the 100th iteration of the query. > > 3) use a temporary (or static) table to cache and serialize your > > results. Use it to navigate to the subset of records you seek. > > > > > > Shawn Green > > Database Administrator > > Unimin Corporation - Spruce Pine > >