Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-14 Thread Peter Bi

To Ward's first post: I think one may not even need a server-side cookie.
A client-side cookie fits the need exactly.

Peter

- Original Message -
From: Rob Nagler [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, June 13, 2002 7:49 PM
Subject: Re: mod_perl/passing session information (MVC related, maybe...)


 Perrin Harkins writes:
  My preferred design for this is to set one cookie that lasts forever and
  serves as a browser ID.

 I like this.  It's clean and simple.  In this sense, a browser is not
 really a session.  The only thing I don't like is garbage collection.

  unique browser ID (or session ID, if you prefer to give out a new one
  each time someone comes to the site) lets you track this for
  unregistered users.

 We call this a visitor id.  In the PetShop we have a cart id, but
 we're not too happy with the abstraction.

  I don't see that as a big deal.  You'd have to delete lots of other data
  associated with a user too.  Actually deleting a user is something I've
  never seen happen anywhere.

 We do.  Especially when we went from free to fee. :-(  The big issue I
 have with session data is that it is often a BLOB which you can't
 query.

  Well, eToys handled more than 2.5 million pages per hour, but caching
  can be important for much smaller sites in some situations.

 I'd like numbers on smaller and some. :)

  Here's a situation where a small site could need caching:

 We cache, too.  An interesting query is the club count on
 bivio.com's home page.  The count of clubs is a fast query, but the
 count of the members is not (about 4 seconds).  We compute a ratio
 when the server starts of the members to clubs.  We then run the club
 count query and use the ratio to compute the member count.  We restart
 the servers nightly, so the ratio is computed once a day.

  Maybe I just have bad luck, but I always seem to end up at companies
  where they give me requirements like these.

 It's the real world.  Denormalization is necessary, but only after you
 test the normal case.  One of the reasons I got involved in this
 discussion is that I saw a lot of messages about solutions and very
 few with numbers identifying the problem.

 Rob







RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Vuillemot, Ward W

I am not sure if I follow completely.  How do you verify that the user is
who he/she says he/she is?
I log into your web-site as memberA.  You kindly leave me a delicious cookie
with my username stored in it.  Maybe even my password (I hope not!).  Now,
I know that another member, memberB, has special rights to your site.  What
is stopping me from editing the cookie to memberB's username and hijacking
their account?  And if you do store the password information in the
cookie...you are letting each user be compromised either as the cookie is
flung through the Internet ether, or minimally on their own computer where
someone else can easily access the cookies.  

With sessionID, you have an ID and information that is checksum'd.  We keep
that sessionID in a DB once a user has successfully logged in.  Whenever I
check the sessionID I know they have already logged in -- on top of that,
there is no really useful information in the cookie that might compromise
security.  (Plus, the checksum ensures that no one is tampering with the
cookie.)  But you do keep username, userlevel information within the session
information stored on your local DB -- where it is reasonably safe from prying
eyes.

If I wanted to delete a user and ensure they immediately lost all access, it
is rather trivial to go through all active sessions in the db, see if the
user I am deleting matches the username in the session information, and if
so delete the session record.

   :  * User logs in.
   :  * Site Admin decides to delete the user.
   :  * In our stateless servers, the user_id is invalidated immediately.
   :  * Next request from User, he's implicitly logged out, because the
   :    user_id is verified on every request.
   :
   :  In the case of a session-based server, you have to delete the user and
   :  invalidate any sessions which the user owns.



RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Drew Taylor

At 07:32 AM 6/13/02 -0700, Vuillemot, Ward W wrote:

I log into your web-site as memberA.  You kindly leave me a delicious cookie
with my username stored in it.  Maybe even my password (I hope not!).  Now,
I know that another member, memberB, has special rights to your site.  What
is stopping me from editing the cookie to memberB's username and hijacking
their account?
<snip>
(Plus, the checksum ensures that no one is tampering with the
cookie.)

You touched this subject in the next paragraph. You should always include a 
hash or checksum as part of your cookie value. And then validate this info 
on each request. This prevents the situation you described where you just 
change the cookie. Even if the cookie value is just a session id, it is 
nice to have the hash to make sure they just don't go changing their 
cookie, but not necessary if your session IDs are random.
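
A minimal sketch of the hash-in-the-cookie idea, in Python rather than
mod_perl.  It swaps in HMAC-SHA256 with a server-side secret for the generic
"hash or checksum" mentioned above; SECRET_KEY, sign_cookie and verify_cookie
are hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = b"change-me"


def _digest(value: str) -> str:
    """Hex HMAC-SHA256 of the cookie value under the server secret."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_cookie(value: str) -> str:
    """Return 'value|digest' so that later tampering can be detected."""
    return f"{value}|{_digest(value)}"


def verify_cookie(cookie: str):
    """Return the original value if the digest matches, otherwise None."""
    value, sep, digest = cookie.rpartition("|")
    if not sep:
        return None  # no digest present at all
    expected = _digest(value)
    # Constant-time comparison on bytes avoids leaking how much matched.
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

Changing "memberA" to "memberB" in the cookie then fails verification unless
the attacker also knows the secret.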

Drew





==
Drew Taylor  |  Freelance web development using
http://www.drewtaylor.com/   |  perl/mod_perl/MySQL/postgresql/DBI
mailto:[EMAIL PROTECTED]   |  Email jobs at drewtaylor.com
--
Speakeasy.net: A DSL provider with a clue. Sign up today.
http://www.speakeasy.net/refer/29655
==




RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Rob Nagler

Vuillemot, Ward W writes:
 I log into your web-site as memberA.  You kindly leave me a delicious cookie
 with my username stored in it.  Maybe even my password (I hope not!).  Now,
 I know that another member, memberB, has special rights to your site.  What
 is stopping me from editing the cookie to memberB's username and hijacking
 their account?

If you can crack Blowfish, IDEA, etc., you are in.  Then again you can
probably just sniff the network for memberB's username and everybody
else's passwords for that matter, even via SSL.

Part of bOP is multi-tiered security architecture including something
I call data gateways to help protect against programmer mistakes.

 And if you do store the password information in the
 cookie...you are letting each user be compromised either as the cookie is
 flung through the Internet ether, or minimally on their own computer where
 someone else can easily access the cookies.

If you have access to someone's cookie file, you probably can log
their keystrokes.  Contact your local spy agency for more information
on how to do this.

 With sessionID, you have an ID and information that is checksum'd.

Sessions and user IDs are equivalent.  They are called credentials
which allow access to a system.  There's no fundamental difference
between hijacking a session or stealing a user id/password.

 If I wanted to delete a user and ensure they immediately lost all access, it
 is rather trivial to go through all active sessions in the db, see if the
 user I am deleting matches the username in the session information, and if
 so delete the session record.

Denormalization is the root of all evil.  The extra step involves more
code, more bugs, and more system resources.  Other than that, you're
right.  You can do this, but the question I ask: Do you need to?

Rob






Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread John Siracusa

On 6/13/02 11:04 AM, Rob Nagler wrote:
 With sessionID, you have an ID and information that is checksum'd.
 
 Sessions and user IDs are equivalent.  They are called credentials
 which allow access to a system.  There's no fundamental difference
 between hijacking a session or stealing a user id/password.

Well, given a user/pass, you can log in from anywhere.  Given a session
that's tied to information like the remote IP, user agent, date, etc. etc.,
it's a lot harder to reuse that information to login from elsewhere.

-John




Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Perrin Harkins

Rob Nagler wrote:
A session is useful for very limited things, like remembering if this 
user is logged in and linking him to a user_id.
 
 
 We store this information in the cookie.  I don't see how it could be
 otherwise.  It's the browser that maintains the login state.

My preferred design for this is to set one cookie that lasts forever and 
serves as a browser ID.  If that user logs in, you can associate a user 
ID with that browser ID, on the server side.  You never need to send 
another cookie after the very first time someone hits your site.  If you 
decide to attach new kinds of state information to the browser, you 
still don't need to send a new cookie.
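
A rough sketch of the browser ID design described above, in Python and with
an in-memory dict standing in for whatever server-side store would really be
used.  The cookie name "browser_id" and all function names are assumptions
made for illustration.

```python
import secrets

# Stand-in for a server-side table mapping browser IDs to user IDs.
browser_to_user = {}


def get_or_create_browser_id(cookies: dict) -> str:
    """Reuse the browser ID sent by the client, or mint a new random one."""
    browser_id = cookies.get("browser_id")
    if browser_id is None:
        # 16 random bytes, hex-encoded; this is the only cookie ever set.
        browser_id = secrets.token_hex(16)
    return browser_id


def log_in(browser_id: str, user_id: int) -> None:
    """Associate a user with the browser on the server side only."""
    browser_to_user[browser_id] = user_id


def current_user(browser_id: str):
    """Return the logged-in user ID for this browser, or None if anonymous."""
    return browser_to_user.get(browser_id)
```

Logging in changes only the server-side mapping, so no new cookie has to be
sent after the first visit.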

Many sites need to keep track of state information (like what's in your 
shopping cart) for anonymous users who haven't logged in.  Having this 
unique browser ID (or session ID, if you prefer to give out a new one 
each time someone comes to the site) lets you track this for 
unregistered users.

 Consider the following scenario:
 
 * User logs in.
 * Site Admin decides to delete the user.
 * In our stateless servers, the user_id is invalidated immediately.
 * Next request from User, he's implicitly logged out, because the user_id
   is verified on every request.
 
 In the case of a session-based server, you have to delete the user and
 invalidate any sessions which the user owns.

I don't see that as a big deal.  You'd have to delete lots of other data 
associated with a user too.  Actually deleting a user is something I've 
never seen happen anywhere.

Although Oracle can be fast, some data models and application 
requirements make it hard to do live queries every time and still have 
decent performance.  This is especially true as traffic starts to
climb.
 
 
 I've tried to put numbers on some of this.  I've never worked on a
 1M/day site, so I don't know if this is the point where you need
 sessions.  What sites other than eToys need this type of session
 caching?

Well, eToys handled more than 2.5 million pages per hour, but caching 
can be important for much smaller sites in some situations.  It's not 
session caching necessarily, although we did cache session data in a 
local write-through cache on each server.

We knew that the database would probably be the bottleneck in scaling 
our application, and it was.  We took pains to take as much work as 
possible off the database, so that it could spend its resources on 
handling things that can't be cached, like user submitted data and orders.

Here's a situation where a small site could need caching: suppose you 
have a typical hierarchical catalog site, with a tree of categories that 
contain products.  Now suppose that the requirements for the site make 
it necessary to do a pretty hairy query to get the list of products in a 
category, because you have some sort of indirect association based on 
product attributes or something and you have to account for start and 
end dates on every product and various availability statuses, etc. 
Categories should only be shown if they have products in them or if 
their child categories have products in them.  Keep piling on business 
rules.  Then the UI design calls for the front page to have a 
Yahoo-style display showing multiple levels of the category hierarchy, 
maybe 70 categories or so.

Sure, you get your DBAs to tune the SQL and to put all the indexes in 
place, and it gets the results for a single category pretty fast, in .08 
seconds, but you have 70 of them!  When you throw multiple users in the 
mix, all executing these queries every time they hit the homepage, your 
database server will burn a hole through the floor.

Or you can take advantage of your domain knowledge, that the data used 
in generating this page only changes every 6 hours or so, and just cache 
the page, or part of the page, or the data, for an hour.
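
The "cache it for an hour" approach can be sketched as a small time-based
cache.  This is an in-process Python illustration only, not the eToys
implementation; TTLCache and its method names are hypothetical.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised in tests
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, calling compute() only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()  # e.g. run the 70 category queries and render
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

With ttl_seconds=3600 the expensive homepage build runs at most once an hour
per server process, whatever the request rate.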

Maybe I just have bad luck, but I always seem to end up at companies 
where they give me requirements like these.  And then they say to make 
it really fast and handle a billion users.  They are happy to trade 
slightly stale data for very good performance, and part of the 
requirements gathering process involves finding out how often various 
kinds of data change and how much it matters if they are out of date. 
(For example, inventory data for products changes often and needs to be 
much more current than, say, user comments on that product.)

- Perrin





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-13 Thread Rob Nagler

Perrin Harkins writes:
 My preferred design for this is to set one cookie that lasts forever and 
 serves as a browser ID.

I like this.  It's clean and simple.  In this sense, a browser is not
really a session.  The only thing I don't like is garbage collection.

 unique browser ID (or session ID, if you prefer to give out a new one 
 each time someone comes to the site) lets you track this for 
 unregistered users.

We call this a visitor id.  In the PetShop we have a cart id, but
we're not too happy with the abstraction.

 I don't see that as a big deal.  You'd have to delete lots of other data 
 associated with a user too.  Actually deleting a user is something I've 
 never seen happen anywhere.

We do.  Especially when we went from free to fee. :-(  The big issue I
have with session data is that it is often a BLOB which you can't
query.

 Well, eToys handled more than 2.5 million pages per hour, but caching 
 can be important for much smaller sites in some situations.

I'd like numbers on smaller and some. :)

 Here's a situation where a small site could need caching:

We cache, too.  An interesting query is the club count on
bivio.com's home page.  The count of clubs is a fast query, but the
count of the members is not (about 4 seconds).  We compute a ratio
when the server starts of the members to clubs.  We then run the club
count query and use the ratio to compute the member count.  We restart
the servers nightly, so the ratio is computed once a day.
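
The ratio trick above, sketched in Python.  The two callables stand in for
the fast club count and the slow member count queries; the class and method
names are made up for illustration and are not bivio's code.

```python
class MemberCountEstimator:
    """Estimate the member count from the cheap club count and a stored ratio."""

    def __init__(self, count_clubs, count_members):
        self.count_clubs = count_clubs
        clubs = count_clubs()
        members = count_members()  # the slow query, run once at startup
        self.members_per_club = members / clubs if clubs else 0.0

    def estimate_members(self) -> int:
        """Run only the fast query and scale it by the startup ratio."""
        return round(self.count_clubs() * self.members_per_club)
```

The estimate drifts as the real ratio changes, which is why a nightly restart
(recomputing the ratio) keeps it close enough for a home page.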

 Maybe I just have bad luck, but I always seem to end up at companies 
 where they give me requirements like these.

It's the real world.  Denormalization is necessary, but only after you
test the normal case.  One of the reasons I got involved in this
discussion is that I saw a lot of messages about solutions and very
few with numbers identifying the problem.

Rob





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Perrin Harkins

Vuillemot, Ward W wrote:
 There is a Apache::Session which is sufficient to check to see if they are
 logged in, et cetera.
 But I want to be able to remember the last query so that I can return
 results into multiple pages along with memory of where in the stack I am at.

You can store anything in Apache::Session; it's just a persistent hash 
table.  However, storing query results based on a user's session is not 
a good idea!  What if a user opens up two browser windows and tries 
to do a search in each one?  Server-side session data is global to all 
browser windows, so they'll get bizarre and incorrect results.  If you 
check any of the major sites you'll see that they handle multiple 
windows correctly.

My suggestions would be to have a separate cache just for query results. 
  Turn the sorted query parameters into a key.  If someone goes to page 
2 of the results, you just pull them out of the cache.
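
One way to turn the sorted query parameters into a cache key, sketched in
Python.  Hashing with SHA-256 and ignoring a paging parameter named
limit_start are assumptions made here so that every page of one search
shares a key; neither comes from the modules mentioned in this thread.

```python
import hashlib
from urllib.parse import urlencode


def query_cache_key(params: dict, ignore=("limit_start",)) -> str:
    """Build a stable key from query parameters, independent of their order."""
    # Sort so that name=foo&size=M and size=M&name=foo give the same key.
    items = sorted((k, v) for k, v in params.items() if k not in ignore)
    canonical = urlencode(items)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the key depends only on the search itself, two browser windows
running different searches never overwrite each other's cached results.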

 There are persistent modules, but I am wondering if there is a better way
 with Apache and mod_perl

There have been a few benchmarks of ways to store a persistent hash. 
I'll have some new numbers on this soon, but for now I'd suggest looking 
at Cache::Cache, MLDBM::Sync, or Cache::Mmap.

- Perrin





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Ken Y. Clark

On Wed, 12 Jun 2002, Vuillemot, Ward W wrote:

 Date: Wed, 12 Jun 2002 06:58:24 -0700
 From: Vuillemot, Ward W [EMAIL PROTECTED]
 To: 'Peter Bi' [EMAIL PROTECTED], [EMAIL PROTECTED],
  Eric Frazier [EMAIL PROTECTED]
 Subject: mod_perl/passing session information (MVC related, maybe...)

 I was wondering how people are saving state between pages of a session.

 There is a Apache::Session which is sufficient to check to see if
 they are logged in, et cetera.  But I want to be able to remember
 the last query so that I can return results into multiple pages along
 with memory of where in the stack I am at.  The easiest would be to
 store the query parameters along with the count information. . .but
 I do not want to use Apache::Session as I believe that has too much
 overhead for this sort of thing.  There are persistent modules, but
 I am wondering if there is a better way with Apache and mod_perl --
 that ppl have tried and can vouch for its validity.

 Thanks!
 Ward

Ward,

I do things like this all the time, though I wonder if I don't do it
the Hard Way.  Basically, I define a MAX_RESULTS per page (like 25)
and return the first set of records to the user.  To make the
clickable links to Previous, Next, and the 1-n pages, I've munged
the query results in Perl and a couple template packages to make each
link contain everything necessary to perform the query again
(including every parameter from the original request) and putting in
the appropriate limit_start number (or whatever you want to call
your limiting variable) for the set.

E.g., if I'm looking for all the records where name=foo and
size=M and I got back 100 results, with a MAX_RESULTS of 25, I'd
have to make four pages.  The second page might look like this:

<a href="/search?name=foo;size=M;limit_start=0">Previous</a> |
<a href="/search?name=foo;size=M;limit_start=0">1</a> |
2 |
<a href="/search?name=foo;size=M;limit_start=50">3</a> |
<a href="/search?name=foo;size=M;limit_start=75">4</a> |
<a href="/search?name=foo;size=M;limit_start=50">Next</a>

Now, that's a lot of stuff to make sure is in your output, and adding
or changing a parameter means a lot of fixing.  However, it is fairly
simple, and I can grok it, so I stick with it.  I'd be happy to hear
of better ways.
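
For illustration, the link generation can be kept in one place with a small
helper.  This is a Python sketch, not the template code referred to above;
page_links is a hypothetical name, and it joins parameters with "&"
(HTML-escaped) instead of the ";" separator used in the example links.

```python
from html import escape
from urllib.parse import urlencode

MAX_RESULTS = 25


def page_links(path, params, total, limit_start, per_page=MAX_RESULTS):
    """Return the ' |'-separated navigation links for one page of results."""

    def link(start, label):
        # Every link repeats all original parameters plus its own offset.
        query = urlencode({**params, "limit_start": start})
        href = escape(f"{path}?{query}")
        return f'<a href="{href}">{label}</a>'

    parts = []
    if limit_start > 0:
        parts.append(link(max(limit_start - per_page, 0), "Previous"))
    for number, start in enumerate(range(0, total, per_page), start=1):
        # The current page is shown as plain text, the others as links.
        parts.append(str(number) if start == limit_start else link(start, number))
    if limit_start + per_page < total:
        parts.append(link(limit_start + per_page, "Next"))
    return " |\n".join(parts)
```

Calling page_links("/search", {"name": "foo", "size": "M"}, 100, 25) produces
the six entries of the second page, and adding a parameter only means adding
it to the dict.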

FWIW, I do pretty much the same thing to re-sort tables of data by
column headers.  So for a table of shirts with attributes of color
and price, I'd do something like:

<tr>
...
<th><a href="/view_shirts?order_by=color">Color</a></th>
<th><a href="/view_shirts?order_by=price">Price</a></th>
...
</tr>

Sprinkle in the same code for limiting to a manageable result set, and
those are all my tricks.

HTH,

ky




RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Jeff AA


 From: Perrin Harkins [mailto:[EMAIL PROTECTED]] 
 Sent: 12 June 2002 15:11

 You can store anything in Apache::Session; it's just a persistent hash
 table.  However, storing query results based on a user's session is not
 a good idea!  What if a user opens up two browser windows and tries
 to do a search in each one?  Server-side session data is global to all
 browser windows, so they'll get bizarre and incorrect results.

Agreed, but he wasn't talking about storing the results, just the query
parameters and current offset / number of rows, which is a-ok for
putting into a session.

some query session do's and don'ts...

Don't forget that you can have multiple sessions - store the query
params in a session identified by a query_id so that subsequent requests
just say something like:
<A HREF='/search?query_id=123456789&action=next'>Next</A>

Don't mix transient query sessions with a User Session that stores info
about the user's logged in state etc. It would be normal for one user to
have multiple queries in a login session

Don't bother passing the query ids in cookies, they are not browser
session specific. Just use the query_id as a parameter in the
first/next/prev/last links as exampled above. You can then have a web
control page that handles multiple queries simultaneously

Do put the user_id into the query session and check it against the
user_id in the User session to prevent query hijack


 My suggestions would be to have a separate cache just for query results.

Or even to use a database that has a decent approach to caching. MySQL
promises automatic cacheable paged queries in the near future. And if
you write your own DB cache, you then need to manage the DB / cache
synch issues, cache size, cache expiry etc etc issues. Good cache is
very hard to do! better to get it from a real data bank.


 From: Vuillemot, Ward W [mailto:[EMAIL PROTECTED]] 
 Sent: 12 June 2002 14:58

 I want to be able to remember the last query so that I can return
 results into multiple pages along with memory of where in the stack I
 am at.  The easiest would be to store the query parameters along with
 the count information. . .but I do not want to use Apache::Session as
 I believe that has too much overhead for this sort of thing.

Apache::Session is just what you want here! It is an easy peasy way to
remember things on the server, and you can implement it with whatever
type of storage underneath that you want [e.g. database] so that you can
even share sessions when your query is being served by multiple web
servers. If you look through the source, you will see that the overhead
is minimal. You can specialise the session persistence mechanism if you
want to, for example, store the key = value pairs as visible records in
the DB rather than as a serialised blob.


Regards

Jeff





RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Jeff AA


 From: Ken Y. Clark [mailto:[EMAIL PROTECTED]] 
 Sent: 12 June 2002 15:39

 I've munged the query results in Perl and a couple template 
 packages to make each link contain everything necessary to 
 perform the query again (including every parameter from the 
 original request) and putting in the appropriate limit_start
 number...

Using sessions and a query_id is a shortcut for this, instead of stating
all the complex parameters again, you just issue an id and put that into
the link. 

An advantage of the session/id is that you end up with stateful query
instances, and can remember [at least for a short period!] the total
number of items, so that you can say 'Results 1 to 10 of 34,566' without
having to count all results every time. This is also useful if you want
users to be able to jump to a LAST page, as you can for example calc the
starting point for limit statement easily.
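
The arithmetic behind the 'Results 1 to 10 of 34,566' line and the LAST page
jump can be sketched as follows (Python; both function names are
hypothetical, and total is assumed to be the remembered count):

```python
def last_page_start(total: int, per_page: int) -> int:
    """Zero-based offset of the first row on the last page."""
    if total <= 0:
        return 0
    return ((total - 1) // per_page) * per_page


def results_banner(total: int, start: int, per_page: int) -> str:
    """Human-readable range for the page that begins at offset start."""
    first = start + 1
    last = min(start + per_page, total)
    return f"Results {first} to {last} of {total:,}"
```

For 34,566 items at 10 per page, the last page starts at offset 34560 and
holds rows 34561 to 34566.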

One disadvantage is that you cannot link to the query result pages, as
you will no doubt expire the query sessions eventually. By putting all
the params in the link, Ken's way lets the users link to the results,
remember them in their favourites etc.

Another possible feature is to allow the link to override any of the
current query parameters, so to do a DB resort, you can say something
like <A HREF="/query?query_id=12345&order=colour">Sort by Colour</A> and
the order param is not lost in amongst lots of other params. Obviously
changes to the where clause, ordering etc may invalidate current row /
page remembered values.

Further variations are readily available. You can create persistent
queries, rather than session queries, store the params in the DB and let
your users have their very own private or shareable searches. If you use
the optional param override approach, you can store the params once, and
then the options as a separate query that refers to the underlying
query. When users add columns or other bits to the underlying, all child
searches will respect the change.

Regards
Jeff






Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Perrin Harkins

Jeff AA wrote:
 Agreed, but he wasn't talking about storing the results, just the query
 parameters and current offset / number of rows, which is a-ok for
 putting into a session.

No, that's exactly what ISN'T okay for putting into a session.  If a 
user opens two browser windows, does a search in each, and then pages 
forward in each set of results, he will get completely wrong pages if 
you do this.  The query parameters from the first search will be written 
over and lost.

 Don't forget that you can have multiple sessions - store the query
 params in a session identified by a query_id so that subsequent requests
 just say something like:
 <A HREF='/search?query_id=123456789&action=next'>Next</A>

You could do that, with a unique ID for each set of parameters, but you 
might as well just put the parameters right in the link unless they're 
very long.

 Don't mix transient query sessions with a User Session that stores info
 about the user's logged in state etc. It would be normal for one user to
 have multiple queries in a login session

Hold on, I think we actually agree, but you're using the word session 
for a bunch of different things.  What you're saying here sounds like 
the opposite of what you said above.  In common usage, a session is the 
state of the user's interaction with the application.  A cache of query 
data would be something else.

 Or even to use a database that has a decent approach to caching. MySQL
 promises automatic cacheable paged queries in the near future. And if
 you write your own DB cache, you then need to manage the DB / cache
 synch issues, cache size, cache expiry etc etc issues. Good cache is
 very hard to do! better to get it from a real data bank.

MySQL is fast, but usually not as fast as simple disk access. 
Cache::Cache and Cache::Mmap handle the details of the cache stuff for 
you, making it pretty easy.

- Perrin




RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Eric Frazier

Hi,

I don't know the term "query hijack".  Can you put it in different words?

Thanks,

Eric

At 03:54 PM 2002-06-12 +0100, you wrote:
Do put the user_id into the query session and check it against the
user_id in the User session to prevent query hijack





RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Jeff AA


 From: Eric Frazier [mailto:[EMAIL PROTECTED]] 
 Sent: 12 June 2002 16:52

 I don't know the term "query hijack".  Can you put it in different words?

Let's say your user who is the boss makes a query
  'show me everyone's salary'

and your system checks who he is, and because he is the boss, allocates
query_id 1, issues the query and sends back page one with everyone's
salary details.


Now some other user in the system can say
  /query?query_id=1

and hijack the query results - i.e. they will see the results of the
query, even though they should not be allowed to.


If your security model is user centric, at a minimum you should put the
user_id inside the query_id session, and only let the same user get the
results from the saved query parameters. A better approach is to have
the query ALWAYS authenticate the current user, then you won't ever give
out data to the wrong person, and users can share query links that will
work if they have the appropriate rights.
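
A minimal sketch of the first option, storing the owner's user_id with the
saved query and checking it on every lookup.  It is written in Python with an
in-memory dict in place of a real session store; save_query and load_query
are hypothetical names, and the random query_id is an extra assumption (the
example above uses query_id 1).

```python
import secrets

# Stand-in for the query session store: query_id -> owner and parameters.
query_sessions = {}


def save_query(user_id, params: dict) -> str:
    """Store the query parameters together with the user who issued them."""
    query_id = secrets.token_hex(8)  # hard to guess, unlike a counter
    query_sessions[query_id] = {"user_id": user_id, "params": dict(params)}
    return query_id


def load_query(query_id: str, current_user_id):
    """Return the saved parameters only to the user who created the query."""
    saved = query_sessions.get(query_id)
    if saved is None or saved["user_id"] != current_user_id:
        return None  # unknown query, or someone else's: refuse
    return saved["params"]
```

The stricter alternative, re-checking the current user's rights when the
query itself runs, needs no owner field at all.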


from www.dictionary.com/search?q=hijack

hijack

n : seizure of a vehicle in transit either to rob it or divert it to an
alternate destination [syn: highjack] v : take arbitrarily or by force;
The Cubans commandeered the plane and flew it to Miami [syn:
commandeer, highjack, pirate, expropriate]



Regards
Jeff





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread James G Smith

John Siracusa [EMAIL PROTECTED] wrote:
On 6/12/02 12:17 PM, Perrin Harkins wrote:
 James G Smith wrote:
 The nice thing about the context then is that customers can have
 multiple ones for multiple windows and they can have more than they
 have windows.
 
 How do you tie a context to a window?  I don't see any reliable way to
 do it.  The only way to maintain state for a window (as opposed to
 global state for a session) is to pass ALL the state data on every link.

Nah, you could just shove a context param into all forms and links on each
page, and store the actual (possibly large) context server-side, keyed by
context id (and session id, see below)

<a href="/foo/bar?context_id=2">...</a>
...
<input type="hidden" name="context_id" value="2">
...

Note the tiny context id.  If you lookup contexts using both the context id
and the (cookie-stored) session id, you can get really short context ids :)
Just an idea...

I haven't worked this part out yet, though that is one way I thought
of.  This is similar to how Twig handles contexts.

Another way I was thinking about was making it part of the URL.  For
example:

  https://x.y.z.edu/contextid/rest/of/url.html

The session would be with a cookie.  This would allow cutting and
pasting of URLs for help tickets and such while preserving the
context.  This would also make coding easier by using relative URLs.

Of course, this has all the problems of storing the session ID in the
URL in the same manner.  We might also have to look for links that
open a new browser window and give them a new context.

I'm still working out the details.

I could be really evil and make the URLs 32-hex strings that map to a
context and URL combination :)  Obfuscated web site with no hope of
deep linking
-- 
James Smith [EMAIL PROTECTED], 979-862-3725
Texas A&M CIS Operating Systems Group, Unix



RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Rob Nagler

Jeff AA writes:
 An advantage of the session/id is that you end up with stateful query
 instances,

Stateful instances are also problematic.  You have essentially two
paths through the code: first time and subsequent time.  If you write
the code statelessly, there is only one path.  Fewer bugs, smaller
code, less development.

Sessions are caches.  Add them only when you know you need them.

 and can remember [at least for a short period!] the total
 number of items, so that you can say 'Results 1 to 10 of 34,566' without
 having to count all results every time.

Maybe this is just because we are using Oracle, but if you do a query:

SELECT count(*) FROM bla, bla...

followed up by:

SELECT field1, field2, ... FROM bla, bla...

Oracle will cache the query compilation and results so it is very fast
(basically a round-trip to database server) for the second query.
We execute these two queries on every paged list on every request.
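
The stateless pattern of running both queries on every request can be
sketched with Python's built-in sqlite3 module.  SQLite stands in for Oracle
here, so this shows only the shape of the code, not Oracle's caching; the
club table, its columns and paged_list are invented for the example.

```python
import sqlite3


def paged_list(conn: sqlite3.Connection, limit: int, offset: int):
    """Run the count and the page query on every request; keep no state."""
    total = conn.execute("SELECT count(*) FROM club").fetchone()[0]
    rows = conn.execute(
        # A fixed ORDER BY keeps page boundaries stable between requests.
        "SELECT club_id, name FROM club ORDER BY club_id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return total, rows
```

There is a single code path: the first page and every later page go through
exactly the same two statements.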

One of the advantages of a declarative OR mapping is that you can do
things like sort to select asfields and order queries consistently.
Oracle takes advantage of this.  I don't know if MySQL or Postgres do,
too, but they probably will someday.

It's a bit slow (seconds) with Oracle's Context engine, which we've
been considering replacing.  Most of our queries are not text searches,
in which case Oracle queries take less than 20ms per query.

We're not a large site (peak 50K views/day), and we have enough
hardware (two front ends, two middle tier, one db).  Our smaller sites
(e.g. bivio.biz) run on minimal hardware and use Postgres.  They use
the same code, and it seems to work fine.

Rob





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Per Einar Ellefsen

At 18:20 12.06.2002, John Siracusa wrote:
On 6/12/02 12:17 PM, Perrin Harkins wrote:
  James G Smith wrote:
  The nice thing about the context then is that customers can have
  multiple ones for multiple windows and they can have more than they
  have windows.
 
  How do you tie a context to a window?  I don't see any reliable way to
  do it.  The only way to maintain state for a window (as opposed to
  global state for a session) is to pass ALL the state data on every link.

Nah, you could just shove a context param into all forms and links on each
page, and store the actual (possibly large) context server-side, keyed by
context id (and session id, see below)

But what if someone opens one of the links in a different window, and
continues on the same pages as in the original window, but with different
parameters? The session ID would be the same, the context id would be the
same, but the params would be different, right?


-- 
Per Einar Ellefsen
[EMAIL PROTECTED]





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread John Siracusa

On 6/12/02 12:57 PM, Per Einar Ellefsen wrote:
 But what if someone opens one of the links in a different window, and
 continue on the same pages as in the original window, but with different
 parameters? The session ID would be the same, the context id would be the
 same, but the params would be different, right?

Well, then things break I guess... :)  Maybe you could do some magic based
on what browsers send as the referrer when users explicitly open a link in a
new tab or window?  Probably not worth it...

-John




Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Perrin Harkins

John Siracusa wrote:
 On 6/12/02 12:57 PM, Per Einar Ellefsen wrote:
 
But what if someone opens one of the links in a different window, and
continue on the same pages as in the original window, but with different
parameters? The session ID would be the same, the context id would be the
same, but the params would be different, right?
 
 
 Well, then things break I guess... :)  Maybe you could do some magic based
 on what browsers send as the referrer when users explicitly open a link in a
 new tab or window?  Probably not worth it...

Right, which is why you shouldn't try to store server-side state for 
anything that could be different in multiple browser windows.  Only 
store global browser information on the server-side.  Everything else 
has to go into the links and forms.
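
As an illustration of putting the per-window state into the links, here is a small sketch. The helper names page_link and read_state and the parameter names q, offset and rows are assumptions made for the example:

```python
from urllib.parse import urlencode, parse_qs

def page_link(path, search_params, offset, rows=10):
    """Build a link carrying the complete per-window state."""
    state = dict(search_params, offset=offset, rows=rows)
    # Sort the keys so the same state always yields the same URL.
    return path + "?" + urlencode(sorted(state.items()))

def read_state(query_string):
    """Recover the state from the query string alone; no session lookup."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}

# Two windows with different searches never interfere with each other,
# because neither one writes anything to shared server-side state.
window_a = page_link("/search", {"q": "foosball"}, offset=10)
window_b = page_link("/search", {"q": "pinball"}, offset=30)
print(window_a)
print(read_state(window_b.split("?", 1)[1]))
```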

- Perrin




RE: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Jeff AA


 From: Perrin Harkins [mailto:[EMAIL PROTECTED]] 
 Sent: 12 June 2002 16:29

 Agreed, but he wasn't talking about storing the results, just the query
 parameters and current offset / number of rows, which is a-ok for
 putting into a session.

 No, that's exactly what ISN'T okay for putting into a session.  If a
 user opens two browser windows, does a search in each, and then pages
 forward in each set of results, he will get completely wrong pages if
 you do this.  The query parameters from the first search will be
 written over and lost.

Please - s/session/Apache::Session/g above


 You could do that, with a unique ID for each set of parameters, but you
 might as well just put the parameters right in the link unless they're
 very long.

The [Apache::]session approach makes it easy to store and change lots of
params to the query. It also lets you keep track of [recommendedly]
minimal info about the Query on the server, without having to re-execute
it, and it lets you pick up a previous query with minor tweaks, things
like /query?query_id=12345&order=value+desc, where the tweak doesn't get
lost in the params.
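
A sketch of how such a stored query plus a tweak might be resolved, assuming a hypothetical SAVED_QUERIES table keyed by query_id. The saved parameter names category and min_price are invented for the example:

```python
from urllib.parse import parse_qs

# Hypothetical store of saved query parameters, keyed by query_id.
SAVED_QUERIES = {
    "12345": {"category": "games", "min_price": "10", "order": "name"},
}

def effective_params(query_string):
    """Merge saved parameters with any tweaks given in the URL."""
    given = {k: v[0] for k, v in parse_qs(query_string).items()}
    query_id = given.pop("query_id")
    params = dict(SAVED_QUERIES[query_id])  # copy; keep the saved query intact
    params.update(given)                    # URL tweaks win over saved values
    return params

params = effective_params("query_id=12345&order=value+desc")
print(params)
```

The saved entry is copied before the tweak is applied, so a second window using the same query_id still sees the original ordering.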


 Don't mix transient query sessions with a User Session that stores info
 about the user's logged in state etc. It would be normal for one user to
 have multiple queries in a login session
 
 Hold on, I think we actually agree, but you're using the word session
 for a bunch of different things.  What you're saying here sounds like
 the opposite of what you said above.  In common usage, a session is the
 state of the user's interaction with the application.  A cache of query
 data would be something else.

Again, please s/session/Apache::Session/g 

 MySQL is fast, but usually not as fast as simple disk access.
 Cache::Cache and Cache::Mmap handle the details of the cache stuff for
 you, making it pretty easy.

RANT

I do agree that disk access _can_ be faster, but disagree with the
implication that caching DB results outside the db is a cool trick. I
would assert that in all general circumstances caching DB results is a
Common Mistake. Special circumstances do exist, but in my experience
very rarely, and that's why we have MI6. I can imagine a circumstance
where a cache may prove useful - a large number of concurrent users, all
wanting exactly the same data, slow db connection, non-optimisable
query. This doesn't seem to be the case here where the question was
about a faster Apache::Session.

Interestingly MySQL and other DBs are often as fast as simple disk
access - contrary to popular wisdom, most DB engines actually cache in
memory, with more data access information and hence effective cache
memory usage than is available to external cache components. Yes,
Network transference can be an issue - but hey! be a masochist, buy a
switch!

I recall an impressive chap at a bank, who was asked to address
performance issues. He immediately identified DB queries as taking far
too long, and proceeded to hand craft a mega-smart shareable multi-user
in-memory cache server in C. He ran into dozens of issues, which were
ingeniously solved using the tersest possible sin tax. After about six
months of effort, the performance problem still existed - the DB
resided entirely in memory anyway! A tweak of the schema [i.e. about 2
hours including test and release] by a DB admin took the problematic
process from 2 hours down to 120 seconds. We spent cash for cache, and
lived to rue the day.

I parse 'use a cache for db stuff' as 'my XYZ cache component is way
smarter than all the guys at 'Oracle|Sybase|MySQL' combined', or 'I know
my data better than the database, cos I'm a kewl Koder'. Actually, I
really parse 'use a cache for db stuff' as 'I don't really understand
databases, 3NF and indexing, and can't be bothered learning to use them
well'.

/RANT

But ok then, use a cache for your mod_perl query parameters, but don't
call it an [Apache::]Session.

8-)





Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Perrin Harkins

Rob Nagler wrote:
 Stateful instances are also problematic.  You have essentially two
 paths through the code: first time and subsequent time.  If you write
 the code statelessly, there is only one path.  Fewer bugs, smaller
 code, less development.

I find you can tie this cache stuff up inside of your data access 
objects and make it all transparent to the other code.  That worked 
really well for me.  There are hooks for this in some of the O/R mapping 
modules on CPAN.
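
A minimal sketch of a data access object that hides its cache, under the assumption of a hypothetical ProductStore class with injected load and save callables. This is not the API of any particular O/R mapping module:

```python
class ProductStore:
    """Data access object that hides a read-through cache from callers."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical loader: id -> record
        self._cache = {}

    def get(self, product_id):
        # Callers cannot tell whether this hit the cache or the database.
        if product_id not in self._cache:
            self._cache[product_id] = self._load_from_db(product_id)
        return self._cache[product_id]

    def update(self, product_id, record, save_to_db):
        # Writes go to the database and drop the stale cached copy.
        save_to_db(product_id, record)
        self._cache.pop(product_id, None)

# A dictionary stands in for the database; db_reads records each real read.
db = {1: {"name": "foosball table"}}
db_reads = []

def load(product_id):
    db_reads.append(product_id)
    return dict(db[product_id])

store = ProductStore(load)
store.get(1)
store.get(1)
print(len(db_reads))
```

Calling code only ever sees get and update, so the cache can be removed or replaced without touching it.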

 Sessions are caches.

One of the things Java programmers often do wrong is cache general data 
in the session, because the servlet API makes it so easy to do.  But 
most data that people cache (as we're seeing in this discussion about 
search params) is not user-specific and thus doesn't belong in the 
session (i.e. everyone who searches for foosball gets the same result).

A session is useful for very limited things, like remembering if this 
user is logged in and linking him to a user_id.  Almost everything else 
belongs either in separate database tables or in the query args passed 
on each page.

 Oracle will cache the query compilation and results so it is very fast
 (basically a round-trip to database server) for the second query.
 We execute these two queries on every paged list on every request.

Although Oracle can be fast, some data models and application 
requirements make it hard to do live queries every time and still have 
decent performance.  This is especially true as traffic starts to climb.
That's when you can add in some caching and take a lot of stress off
the database.  There are a million ways to implement caching, from
denormalized tables to replicated databases to BerkeleyDB to mod_proxy,
and most web applications have some data that is read-only or close to
it.  (I know that yours deals with financial data, so in your case it 
may actually have to be all real-time data.)

- Perrin




Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Perrin Harkins

Jeff AA wrote:
 Interestingly MySQL and other DBs are often as fast as simple disk
 access - contrary to popular wisdom, most DB engines actually cache in
 memory, with more data access information and hence effective cache
 memory usage than is available to external cache components. Yes,
 Network transference can be an issue - but hey! be a masochist, buy a
 switch!

It's a simple rule: if you do less work, you will finish faster. 
Reading a file will go to the file system code in the kernel, which uses 
some sort of in-memory cache on any modern OS.  That means that for any
frequently accessed data you are reading it from memory using system-level
calls.  By contrast, MySQL has to deal with network transfers and SQL 
parsing before it reaches that stage.  It's not a huge difference, but 
it is a difference.  I'll have numbers on this stuff soon as part of my 
article on data sharing with mod_perl, so that people can compare and 
see if it's worth the effort for them.

The more important reason to cache is scalability.  Every time you don't 
hit the database, that means more resources are available to handle the 
queries that can't be cached.  On a site with heavy traffic, that's very 
important.

 I parse 'use a cache for db stuff' as 'my XYZ cache component is way
 smarter than all the guys at 'Oracle|Sybase|MySQL' combined', or 'I know
 my data better than the database, cos I'm a kewl Koder'. Actually, I
 really parse 'use a cache for db stuff' as 'I don't really understand
 databases, 3NF and indexing, and can't be bothered learning to use them
 well'.

I've worked with some good DBAs, but there is a limit to what they can 
do.  Ultimately, a database is designed to always give 100% correct 
up-to-date results, but in most web applications people would prefer to 
get slightly out-of-date results if they can get them much faster. 
Databases don't know how to do that.  Why should you go to MySQL every 
time someone hits the front page of Slashdot just to give them the very 
latest count on comments?  Caching that page for 1 minute takes a ton of 
load off the database and doesn't really impact the user experience.
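
To make the one-minute idea concrete, here is a sketch of a time-limited cache. The TimedCache class, the fake clock and the count_comments function are all assumptions for illustration and have nothing to do with how Slashdot actually works:

```python
import time

class TimedCache:
    """Cache one computed value for a fixed number of seconds."""

    def __init__(self, compute, ttl_seconds=60, clock=time.monotonic):
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()   # the only place the database is hit
            self._expires_at = now + self._ttl
        return self._value

# Demonstration with a fake clock and a fake comment counter.
fake_now = [0.0]
db_hits = []

def count_comments():
    db_hits.append(1)
    return 100 + len(db_hits)

comment_count = TimedCache(count_comments, ttl_seconds=60,
                           clock=lambda: fake_now[0])
first = comment_count.get()    # computed
fake_now[0] = 59.0
second = comment_count.get()   # served from cache, slightly out of date
fake_now[0] = 61.0
third = comment_count.get()    # recomputed after expiry
print(first, second, third, len(db_hits))
```

Three requests cost two database hits here. On a busy page the saving is every request that arrives inside the same minute.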

I fully agree that optimizing the database and SQL is the first step, 
but correct use of caching can make a huge difference on high-volume sites.

- Perrin




Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Issac Goldstand

- Original Message -
From: John Siracusa [EMAIL PROTECTED]
To: Mod Perl Mailing List [EMAIL PROTECTED]
Sent: Wednesday, June 12, 2002 8:06 PM
Subject: Re: mod_perl/passing session information (MVC related, maybe...)


 On 6/12/02 12:57 PM, Per Einar Ellefsen wrote:
  But what if someone opens one of the links in a different window, and
  continue on the same pages as in the original window, but with different
  parameters? The session ID would be the same, the context id would be the
  same, but the params would be different, right?

 Well, then things break I guess... :)  Maybe you could do some magic based
 on what browsers send as the referrer when users explicitly open a link in a
 new tab or window?  Probably not worth it...

 -John

Wait a second!  But then what are you gaining out of your context ID?  If
you can't make new contexts for new windows, when WILL you make contexts?
That's coming back to the original problem of session IDs, isn't it?
  Issac




Re: mod_perl/passing session information (MVC related, maybe...)

2002-06-12 Thread Rob Nagler

Perrin Harkins writes:
 I find you can tie this cache stuff up inside of your data access 
 objects and make it all transparent to the other code.

Absolutely.

 A session is useful for very limited things, like remembering if this 
 user is logged in and linking him to a user_id.

We store this information in the cookie.  I don't see how it could be
otherwise.  It's the browser that maintains the login state.

Consider the following scenario:

* User logs in.
* Site Admin decides to delete the user.
* In our stateless servers, the user_id is invalidated immediately.
* Next request from User, he's implicitly logged out, because the user_id
  is verified on every request.

In the case of a session-based server, you have to delete the user and
invalidate any sessions which the user owns.
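
The stateless scenario can be sketched roughly as follows. The post does not say how the cookie is protected, so the HMAC signature, the SECRET key and the USERS dictionary (standing in for the users table) are all assumptions made for the example:

```python
import hashlib
import hmac

SECRET = b"example-secret"        # hypothetical; keep a real key out of source
USERS = {"42": {"name": "ward"}}  # stands in for the users table

def make_cookie(user_id):
    """Return 'user_id:signature' so the browser holds the login state."""
    sig = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return user_id + ":" + sig

def current_user(cookie):
    """Verify cookie and user on every request; None means logged out."""
    user_id, _, sig = cookie.partition(":")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    return USERS.get(user_id)     # a deleted user simply fails this lookup

cookie = make_cookie("42")
before = current_user(cookie)
del USERS["42"]                   # site admin deletes the user
after = current_user(cookie)
print(before, after)
```

No session records exist to clean up: the user's next request fails the lookup and is treated as logged out.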

 Although Oracle can be fast, some data models and application 
 requirements make it hard to do live queries every time and still have 
 decent performance.  This is especially true as traffic starts to
 climb.

I've tried to put numbers on some of this.  I've never worked on a
1M/day site, so I don't know if this is the point where you need
sessions.  What sites other than eToys need this type of session
caching?

Rob