On 4 Apr 2012, at 17:19, Morten Jorgensen <morten.jorgen...@openjawtech.com> wrote:
> Thanks again for your comments. More replies below:

>>>> That's interesting. Can you share some details about how it works?

>>> Sure. It is quite simple. Cassandra is effectively a multi-level
>>> distributed hash-map, so it lends itself very well to storing session
>>> attributes.
>>>
>>> The session manager maintains two column families (like tables), one to
>>> hold session meta-data such as the last access timestamp, etc. and one
>>> column family to hold session attributes. Storing or reading a session
>>> attribute is simply a matter of writing it using the session ID as the
>>> row ID, and the session attribute name as the column name, and the
>>> session attribute value as the column value.
>>>
>>> Session attributes are read and written independently, so the entire web
>>> session does not have to be loaded into memory - only the session
>>> attributes that are actually required to service a request are read.
>>> This greatly reduces the memory footprint of the web applications that I
>>> am developing for my employer.

>> I'd be concerned about how chatty that was.
>>
>> Devil's advocate question: why store data in the session if it's not needed?

> Good question. For large web applications, and particularly web-based UIs
> with multiple user screens, you would have certain data in your session
> for the various screens/pages.

Not if I could avoid it, I wouldn't. I might have user data or refs that I
need for each page, but everything else goes in request scope.

> Not all pages need _all_ data in your session, and since the session
> manager loads session attributes only when the web app code asks for it,
> only the data that is required for the current page is loaded from
> Cassandra.

>>> For improved performance I have added a write-through and a write-back
>>> cache, implemented as servlet filters. The cache is flushed or written
>>> back once the current request has finished processing.
>>> I am sure there is room for improvement here, as multiple concurrent
>>> requests for the same session should be served using the same cache
>>> instance.

>> But... (more devil's advocating, sorry) while this should address the
>> chattiness* problem, doesn't it mean that your solution is invasive and
>> can't be really deployed without modifying an app?

> The session manager works without this cache, but is slow. The cache is
> a filter configured in web.xml. The code of a web app won't have to be
> changed, but you need to update your web.xml to use the session manager
> effectively.

Adding a Filter means modifying the app. </pedant>

>> * is that even a word?

> Yes it is, according to dictionary.com

>>> The Manager does not maintain any references to Session instances at
>>> all, allowing them to be garbage collected at any time. This makes
>>> things very simple, as Cassandra holds all session state, and the
>>> session managers in my Tomcat nodes only act as a cache in front of
>>> Cassandra.
>>>
>>> The nature of Cassandra and the Tomcat's implementation of web sessions
>>> go together extremely well. I am surprised that nothing like this exists
>>> already. It is a square hole, square peg sort of scenario.

>> I'm not entirely sure I agree.
>>
>> Cassandra trades off consistency for availability and partition
>> tolerance, whereas I'd suggest a session management solution would want
>> to trade partition tolerance for consistency and availability.
>>
>> I'm also not sure that the comparison between column store and session
>> attribute map stands up beyond the initial/apparent similarity between
>> data type.
>>
>> Cassandra is write-optimised and hits disk (on at least two nodes for
>> HA) for every write AFAIK.

> Cassandra allows you to choose your consistency level. I use a quorum
> write, which writes to (N/2)+1 Cassandra nodes, where the Cassandra ring
> contains N nodes.
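
For reference, the (N/2)+1 arithmetic can be sketched as below. This is only an illustration: QuorumSketch and its method names are made up here and are not Cassandra APIs. As far as I understand Cassandra, N in that formula is the number of replicas holding the row (the keyspace's replication factor) rather than every node in the ring, so treat the parameter as "replica count".

```java
// Sketch only: QuorumSketch and its methods are hypothetical names, not Cassandra APIs.
final class QuorumSketch {

    /** Smallest number of replicas that forms a majority of n replicas. */
    static int quorum(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return n / 2 + 1; // integer division rounds down
    }

    /** A read overlaps the most recent successful write when R + W > N. */
    static boolean readSeesLatestWrite(int n, int readReplicas, int writeReplicas) {
        return readReplicas + writeReplicas > n;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println("N=" + n + " quorum=" + quorum(n));
        }
    }
}
```

With three replicas a quorum is two, so a quorum write paired with a quorum read always overlaps on at least one replica, while a quorum write paired with a single-replica read does not have that guarantee.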
But as you say, you've discovered why this is slow for a webapp, and you
have to add a cache to each request to fix it. I'd suggest you'd be better
off just loading data into the request scope directly, rather than
indirectly.

> I think this makes sense for web session data, and my current
> implementation has this consistency-level hard-coded. I think it would
> probably make sense to allow this to be configured.

>>> I also have an implementation of the Map interface that stores the
>>> values of each entry as a session attribute. The way many developers
>>> write web applications is to have a "session bean" (a session attribute)
>>> that contains a Map that maintains the actual session attributes. This
>>> is OK if the entire session is persisted as a whole, but it won't
>>> perform very well with the Cassandra session manager (or the Delta
>>> Session Manager from what I understand). A developer can replace their
>>> session bean's HashMap with the SessionMap utility, and the session
>>> attributes will be treated as proper session attributes by the session
>>> manager.

>> Is there not a way to do this internally & therefore transparently to
>> the developer? Otherwise you're introducing more dependencies and
>> creating more of a framework than a pluggable manager.

> I don't think there is a clean way of doing this without overriding the
> default Map implementations of the JVM. But, I think storing session data
> as individual session attributes rather than large object hierarchies is
> good (but not common) programming practice. It allows the session
> container/manager to manage read/write operations of the session
> attributes separately. This practice should benefit not only my Cassandra
> session manager but also the existing Delta manager.

>>>> 1. Be relatively self-contained -- i.e. not require much in the way of
>>>> changes to existing classes

>>> There are no changes to existing classes.
>>> My session manager implements the existing org.apache.catalina.Manager
>>> interface.

>> Instead of the filter, could you use a Valve?

> For the cache? The main reason why I use a filter is to be able to tie a
> cache object to a thread-local variable for the period for which the
> request is being processed. As soon as the response is streamed to the
> client the cache is released. If Tomcat already contains some internal
> reference to the current request then I won't need to use a filter in
> this manner.

It must do, right? A Valve is similar to a Filter but has access to the
internal representations of the request/session so would mean you don't
need to interfere with the app.

> I am not a fan of thread-local variables, so I'd very much like to remove
> the dependency on having this filter in place.
>
> Morten
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
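
To make the design discussed in this thread concrete, here is a minimal sketch of the two ideas: attributes stored per session ID (row) and attribute name (column), and a per-request cache that loads attributes lazily and writes changes back once the request is done. Everything in it is an assumption made for illustration: the class names are hypothetical, the nested map merely stands in for a column family, and no Cassandra or Tomcat API is used. It is not Morten's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names; the nested map stands in for a
// Cassandra column family. No Cassandra or Tomcat API is used here.
final class AttributeStoreSketch {
    // row key (session ID) -> column name (attribute name) -> column value
    private final Map<String, Map<String, Object>> rows = new HashMap<>();
    int reads;   // counts simulated round trips for reads
    int writes;  // counts simulated round trips for writes

    Object read(String sessionId, String name) {
        reads++;
        Map<String, Object> row = rows.get(sessionId);
        return row == null ? null : row.get(name);
    }

    void write(String sessionId, String name, Object value) {
        writes++;
        rows.computeIfAbsent(sessionId, id -> new HashMap<>()).put(name, value);
    }
}

// One instance per request: attributes are fetched on first use and
// modified attributes are held back until flush() is called.
final class RequestCacheSketch {
    private final AttributeStoreSketch store;
    private final String sessionId;
    private final Map<String, Object> loaded = new HashMap<>();
    private final Map<String, Object> dirty = new HashMap<>();

    RequestCacheSketch(AttributeStoreSketch store, String sessionId) {
        this.store = store;
        this.sessionId = sessionId;
    }

    Object getAttribute(String name) {
        if (!loaded.containsKey(name)) {
            // Only attributes the request actually touches are read.
            loaded.put(name, store.read(sessionId, name));
        }
        return loaded.get(name);
    }

    void setAttribute(String name, Object value) {
        loaded.put(name, value);
        dirty.put(name, value);
    }

    // Call once the request has finished processing.
    void flush() {
        dirty.forEach((name, value) -> store.write(sessionId, name, value));
        dirty.clear();
    }
}
```

In this sketch, setting the same attribute several times during one request costs a single write at flush time, and reading an attribute repeatedly costs one read. Whether the cache object is attached from a Filter or a Valve only changes who creates it and who calls flush().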