Fw: Trans.: Re: Finding the row number satisfying a condition in a result set

SGreen Tue, 31 Jan 2006 06:24:58 -0800

Oops - I too forgot the list! 
----- Forwarded by Shawn Green/Unimin on 01/31/2006 09:19 AM -----


Shawn Green/Unimin
01/31/2006 09:06 AM

To
Jacques Brignon <[EMAIL PROTECTED]>
cc

Subject
Re: Trans.: Re: Finding the row number satisfying a condition in a result set





Thank you for your response!  :-)

How to implement option 1 depends on your client-side environment. If you 
have an application that runs completely client-side, then your results are 
already client-side when you ask for them and you don't have to worry 
about copying the data to the client in an array.

If you are developing a web site, then things change a bit. It is possible 
using DHTML (XHTML, or whatever they are calling it this week) to send all 
of the data to the client in the form of the HTML needed to create an array 
(usually a JavaScript array) within the browser page used to view the data. 
Client-side scripting is then used to scroll through the results (by 
creating and recreating a <TABLE>) and show the user just the "pages" you 
want them to see.

A variant on this is to have a data browser page surrounding a data 
retrieval page (inside an IFRAME), where you manipulate the inner page from 
the outer page by controlling the scrolling in code. (A further variant 
would be to keep the data frame hidden and use client-side script to pick 
just portions of it for display.)

Another way to speed this up would be to cache the results server-side in 
a session-level variable or in a static table that is uniquely identified 
within the session. Then, as the user browses through the data, you don't 
need to run the original query multiple times to get to the particular 
subset of records you want to show. You can take it straight from your 
cache on the web server.

A fourth option could be to use a client-server protocol like SOAP to 
query the database from the client interactively. However, this would 
still cause the database to execute your main query every time you wanted 
just a page of data. 
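
To make the caching idea above a little more concrete, here is a minimal 
sketch in Python (not PHP, and not tied to any particular web framework). 
The class name ResultCache and its methods are made up for illustration; 
it assumes the rows of the main query have already been fetched once and 
can be held in memory or in a session-level variable.

```python
class ResultCache:
    """Holds the rows of one query so pages can be served without re-querying.

    Hypothetical helper for illustration; `rows` is whatever the main query
    returned, already in display order.
    """

    def __init__(self, rows, page_size=10):
        self.rows = list(rows)
        self.page_size = page_size

    def page_count(self):
        # Round up so a partly filled last page is still counted.
        return (len(self.rows) + self.page_size - 1) // self.page_size

    def page(self, number):
        """Return the rows on 1-based page `number` (empty list if out of range)."""
        if number < 1:
            return []
        start = (number - 1) * self.page_size
        return self.rows[start:start + self.page_size]
```

With 15000 cached rows and 10 rows per page, page(57) would hand back 
result rows 561 to 570 without touching the database again.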

You already identified the need to minimize trips into the database. You 
just need to work out the best way to do that within your application's 
design. Odds are, it's going to involve the temporary storage of the 
results of your main query somewhere (a cache of the results). It may also 
require building an index array or two. Look up the "quicksort" and 
"binary search" algorithms if you take this route. They are very efficient 
and I have used them before on large sets of data with good performance 
results.
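
As a rough illustration of the index-array idea, the Python sketch below 
builds a sorted index over a cached result set and binary-searches it to 
get a row's rank and the page it falls on. The function names and the 
"customer_id" key are assumptions for the example, and Python's built-in 
sort (which is not quicksort) stands in for the sorting step.

```python
from bisect import bisect_left

PAGE_SIZE = 10  # rows per page, matching the "10 by 10" display in this thread


def build_index(rows, key):
    """Return (key_value, rank) pairs sorted by key_value.

    `rows` is the cached result set in display order; rank is 1-based.
    """
    return sorted((row[key], rank) for rank, row in enumerate(rows, start=1))


def find_rank(index, value):
    """Binary-search the index for `value`; return its 1-based rank or None."""
    # (value,) sorts just before any (value, rank) pair, so bisect_left
    # lands on the matching entry if there is one.
    pos = bisect_left(index, (value,))
    if pos < len(index) and index[pos][0] == value:
        return index[pos][1]
    return None


def page_bounds(rank, page_size=PAGE_SIZE):
    """Return the first and last row numbers of the page containing `rank`."""
    first = ((rank - 1) // page_size) * page_size + 1
    return first, first + page_size - 1
```

The index is built once per cached result set; each search after that is 
a binary search instead of a loop over every row. For a row found at rank 
563, page_bounds gives rows 561 to 570.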

I am sorry I can't be more specific but there are many approaches to this 
technique and I am not sure which one will work best for your situation. 
Let me know if I can help in any way.

Yours,

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Jacques Brignon <[EMAIL PROTECTED]> wrote on 01/31/2006 07:52:16 AM:

> Thanks Shawn for the detailed answer,
> 
> What I am currently doing is basically what you propose: I do a full query
> to retrieve the row numbers of the subset I want to display and of the
> "selected" record, if any, in that subset, then I use another query with
> LIMIT to get those rows for display.
> 
> What I am trying to do is to improve the performance by limiting the number
> of queries and by identifying the most efficient way of finding the row
> number of the searched record. I am currently using brute force by looping
> through the result set until I find the record. The proposal of storing the
> set in a temp table should improve that, since the row can then be
> retrieved by a query on that table, which we can expect to be faster.
> 
> So, as you correctly describe, what I need is to allow the user to scroll
> through the set. I am therefore using your option 2, doing one query to
> locate the rows and one with LIMIT to get those to be displayed, and of
> course I am hitting performance issues. I also noticed that the queries
> using LIMIT do not all run at the same speed: the closer you get to the end
> of the data set, the more time they take.
> 
> I understand approach number 3 using a temp table. I am also interested in
> your approach number 1, but I am not sure I understand what you mean and how
> you do that using the PHP MySQL function library.
> Do you mean passing all the rows of the result at once to the client
> application and storing them in memory (an array)? If the result set is
> big, couldn't we hit some limits or experience other performance issues? I
> see how to get the values of one row of the result set in PHP; how do you
> get all the rows at once other than by looping through the result set and
> getting one row after the other?
> 
> --
> Jacques Brignon
> 
> Quoting [EMAIL PROTECTED]:
> 
> > Jacques Brignon <[EMAIL PROTECTED]> wrote on 01/30/2006 10:18:59 AM:
> >
> > > Oops! Forgot to include the list in the reply.
> > >
> > > --
> > > Jacques Brignon
> > >
> > > ----- Message forwarded from Jacques Brignon <[EMAIL PROTECTED]> -----
> > >     Date: Mon, 30 Jan 2006 16:16:53 +0100
> > >     From: Jacques Brignon <[EMAIL PROTECTED]>
> > > Reply-To: Jacques Brignon <[EMAIL PROTECTED]>
> > >  Subject: Re: Finding the row number satisfying a condition in a result set
> > >       To: Jake Peavy <[EMAIL PROTECTED]>
> > >
> > > Quoting Jake Peavy <[EMAIL PROTECTED]>:
> > >
> > > > On 1/30/06, Jacques Brignon <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > I would like some advice on the various and best ways of finding
> > > > > > the rank of the row which satisfies a given condition in a result
> > > > > > set.
> > > > > >
> > > > > > Let's assume that the result set includes a field containing an
> > > > > > identifier from one of the tables used in the query and that no
> > > > > > two rows have the same value for this identifier, but that the
> > > > > > result set does not contain all the sequential values for this
> > > > > > identifier and/or the values are not sorted in any predictable
> > > > > > order.
> > > > > >
> > > > > > The brute force method is to loop through all the rows of the
> > > > > > result set until the number is found, to get the rank of the row.
> > > > > > That does not seem very clever, and it can be very time consuming
> > > > > > if the set has a lot of rows.
> > > >
> > > >
> > > >
> > > > use ORDER BY with a LIMIT of 1
> > > >
> > > > your subject line needs work though - a "row number" has no meaning
> > > > in a relational database.
> > > >
> > > > -jp
> > > >
> > >
> > > Thanks for the tip. I am going to think about it, as I do not see right
> > > away how this solves the problem.
> > >
> > > I agree with your comment. It is precisely because the result row number
> > > is not in the database that I need to find it.
> > >
> > > The problem I am trying to solve is the following:
> > >
> > > A query returns a result set with a number of rows, let's say 15000 as
> > > an example.
> > >
> > > I have an application which displays those 10 by 10, with arrow-based
> > > navigation capabilities (first page, previous page, next page, last
> > > page).
> > >
> > > I also have a search capability, and I need to find in which set of 10
> > > results the row I search for will be displayed, in order to show the
> > > appropriate page directly, and to know the rank of this row in the
> > > result set or in the page, in order to show the searched result row
> > > "selected".
> > >
> > > As an example, the row having a customer id of 125 would have the row
> > > # 563 in the result set (ordered not by customer id but by some other
> > > criterion, like name) and would therefore be displayed in the page
> > > showing result rows 561 to 570.
> > >
> > > When I say row, I do not mean a row in any table but a row in the result
> > > set produced by the query, which can touch several tables.
> > >
> > > None of the fields of the result set contains the row number; it is just
> > > the number of times I have to loop through the result set to get the row
> > > in the set which matches my criterion.
> > >
> > > I hope this makes my question clearer.
> > >
> > > I am sure this is a pretty common problem, but I have not yet figured
> > > out the clever way to tackle it!
> > >
> > > --
> > > Jacques Brignon
> > > ----- End of forwarded message -----
> >
> > Yes, that is much clearer. Assuming that your results ARE ordered by some
> > criteria (such as by name) so that the sequence of one query execution
> > closely resembles that of another, then you can artificially create a
> > sequence number by saving those results into a temporary table with an
> > auto_increment column.
> >
> > CREATE TEMPORARY TABLE tmpResults (
> >   rownum int unsigned not null auto_increment
> >   , name varchar(50) not null
> >   , ... other columns in your results ...
> >   , primary key (rownum)
> >   , key(name)
> > );
> >
> > INSERT tmpResults (name,... other columns ...)
> > SELECT name, ... other columns ...
> > ... (rest of query) ...;
> >
> > Now you have somewhere that has a row number on each row of your query. In
> > most applications, it is more efficient to either send the whole recordset
> > to the client and display the results in pages based on the cached results
> > or to run a smaller query of just those fields you want to search by and
> > send them to the client as a form of index. Then the client can ask the
> > server for the FULL query and use the LIMIT offsets you had from the
> > "index" query.
> >
> > What it boils down to is this:
> >
> > a) What you have described may look user friendly but it is database
> > intensive. Your application performance will probably suffer.
> > b) Most queries only ask for what they actually need. If you only wanted
> > results that match a certain name or other condition, only ask for those
> > rows. That may mean modifying your client so that it only asks for the
> > rows the user wants to see.
> > c) If you want your user to "scroll through" a set of results you have
> > three simple options:
> >   1) pull them all on the first query and navigate the results client-side
> > (very fast, quite scalable, most flexible)
> >   2) re-execute the query multiple times on the server and work with each
> > separate result set. This can become highly intensive if what you actually
> > wanted to show is in the 100th iteration of the query.
> >   3) use a temporary (or static) table to cache and serialize your
> > results. Use it to navigate to the subset of records you seek.
> >
> >
> > Shawn Green
> > Database Administrator
> > Unimin Corporation - Spruce Pine
> 
>
