First of all, I like the new name ;) - Using a TreeSet with subsets is a great idea!
- I wouldn't use an extra class just to wrap a HashMap of PageStore. I would just put them into the plain session. But finally, this is just a matter of taste. I even think that this class lacks proper synchronization. Doesn't Terracotta complain about modifying an instance outside of a transaction?? - The way getPage(...) with versionNumber -1 is implemented isn't really nice. Too bad that LinkedHashMap doesn't maintain a pointer to the end of the list although it is a double linked one :-( would make that task much faster. Another possibility would be to make a TreeMap<PageKey,Integer> out of the current TreeSet (which is backed by a TreeMap anyway). The integer would be a counter that indicates the insertion order. One would therefore only have to iterate over a subMap() of the pages containing all PageKeys with the pageId in question (implemented but not tested yet - not gonna happen today). http://www.nabble.com/file/p18321680/OurTerracottaPageStore.java OurTerracottaPageStore.java I think we are quite close to something really cool ;) richardwilko wrote: > > I'm not sure if pages are always inserted in version number, for example, > if you go back to a previous page and start doing something on it again, > it will start inserting pages with a lower version number (i think anyway) > > I also have a modified version, it seems a bit simpler than yours and it > will work not matter when pages are inserted or deleted. It uses an > additional TreeSet on the PageKeys; using the TreeSet's ordering it is > easy to quickly find specific versions, or the highest ajax version for a > normal version, or to find if a version exists. > > I also added some debug code, and made the number of pages limit optional. > > See what you think > > http://www.nabble.com/file/p18318811/OurTerracottaPageStore.java > OurTerracottaPageStore.java > > Richard > > > > > Stefan Fußenegger wrote: >> >> Ok, i now used a LinkedHashMap and a limit of 1000 pages per PageMap. >> This should give sufficient protection and rarely happen. >> >> You were right with the -1 ajaxVersionNumber. I fixed that. >> >> I also fixed the reference to the highestAjaxVersion as there needs to be >> such a reference for each version, not only each pageId. There is now an >> additional HashMap. So finally this implemenation requires two HashMaps >> and a LinkedHashMap per PageMap. >> >> For this implementation, I assumed that pages are inserted in order >> (according to their versions). Could somebody confirm that? Otherwise, >> the map pointing to the highest ajaxVersion would need to be updated when >> the currently highest ajaxVersion is deleted due to an exceeded max pages >> limit (one would have to search for a lower ajaxVersion and point to that >> page). Otherwise, I'd say we are quite close to the DiskPageStore >> implementation (not being asynchronous and not implementing >> ISerializationAwarePageStore - which is only used for Wicket's session >> clustering, right?) >> >> regards >> >> http://www.nabble.com/file/p18318100/MyTerracottaPageStore.java >> MyTerracottaPageStore.java >> >> >> richardwilko wrote: >>> >>> Im still not sure about not limiting the number of pages to keep in >>> session, even DiskPageStore has some sort of limit, imo not having a >>> limit exposes us to the possibility of a single malicious user grinding >>> the system to a halt. Yes terracotta will persist it to disk if needs >>> be, but if that session is is current active use then it will be paging >>> to and from disk all the time. >>> >>> I would like to get the opinion of some other people about this. >>> >>> Also I don't see how the the -1 ajax version can work; in disk based >>> store it treats the -1 the same as in getPage, where it just looks for >>> the highest version, in our case it will construct a key with the -1 >>> value in it, i.e. it will only find the page where ajax version number >>> is -1. Since this cant happen, contains page wont work. We could >>> probably use the helper structure thing to simplify this though. >>> >>> Richard >>> >>> >>> >>> Stefan Fußenegger wrote: >>>> >>>> 1+2) well, it will only add pages as long as the session is alive. if a >>>> page isn't used frequently it will be moved to and later persisted by >>>> the TC server and finally GCed together with its session. therefore i >>>> don't think deleting old pages is necessary. or do you have a special >>>> use case where this could be problematic? maybe a bot crawling >>>> thousands of pages could generate tons of serialized page? but is this >>>> really a problem? >>>> >>>> 3) okay, didn't see that little piece of javadoc. I think an extra >>>> structure keeping track of most recent versions of pageIds could help >>>> to make these searches efficient. >>>> >>>> I changed my code: >>>> - one store per PageMapName, making deletes more efficient >>>> - version info stored for all pageIds (HashMap<Integer,VersionInfo>) >>>> where VersionInfo has a pointer to the most recent page and highest >>>> ajaxVersionNumber >>>> >>>> Comments? >>>> >>>> New file: (untested!) >>>> http://www.nabble.com/file/p18314611/MyTerracottaPageStore.java >>>> MyTerracottaPageStore.java >>>> >>>> >>>> >>>> richardwilko wrote: >>>>> >>>>> Hi Stefan, >>>>> >>>>> Looking through your code I see a couple of issues: >>>>> >>>>> 1) There is no limit on the number of pages stored in the pagemap, >>>>> pages could get added forever. I feel there needs to be a way to >>>>> limit the number of pages stored, with oldest ones discarded first. >>>>> This is how DiskPageStore works. >>>>> >>>>> 2) Following on from point 1, a HashMap does not keep insertion order >>>>> so it is not possible to remove the oldest ones easily. A simple >>>>> change to LinkedHashMap would solve this and make point 1 easy to >>>>> implement. However storing all the pagemaps together does mean that >>>>> the most recent pages from one pagemap could get removed due to high >>>>> use of another pagemap. In this case when the user goes back to the >>>>> other pagemap he/she will encounter an exception. >>>>> >>>>> 3) Your getPage code is not general enough; from the javadocs for >>>>> getPage in IPageStore: >>>>> * If ajaxVersionNumber is -1 and versionNumber is specified, the page >>>>> store must return the page with highest ajax version. >>>>> * If both versionNumber and ajaxVersioNumber are -1, the pagestore >>>>> must return last touched (saved) page version with given id. >>>>> Your method of constructing a key object wouldn't work in these >>>>> situations, as it would only find exact matches, and so getPage would >>>>> require iterating through the entire HashMap and looking at every >>>>> entry. >>>>> >>>>> This issue is the reason why I went for the nested structure I used. >>>>> I do agree that a single storage map would ideally be better, >>>>> especially as this make it easier to better manage the number of pages >>>>> stored, but i'm not sure if it is the most efficient method of storage >>>>> for the complex getPage requirements. By efficient i mean execution >>>>> time rather than memory usage. >>>>> >>>>> Thoughts? >>>>> >>>>> Richard >>>>> >>>>> >>>>> Stefan Fußenegger wrote: >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> I had a thorough look on your code. I have the following remarks: >>>>>> >>>>>> - yes, SerializedPage must be clustered and should therefore >>>>>> implement IClusterable (it is already Serializable, it should >>>>>> therefore be okay to change) >>>>>> - I found two problems with your implementation: >>>>>> 1) unbind() is called during invalidation of a session. >>>>>> getPageStore() will therefore result in a NPE as there is no >>>>>> WebRequest >>>>>> 2) according to the JavaDoc of >>>>>> DiskPageStore#removePage(SessionEntry, String, int) ("page id to >>>>>> remove or -1 if the whole pagemap should be removed") calling >>>>>> removePage(String, String, int) with an id of -1 should delete all >>>>>> pages of a pageMap (however, that's not documented in the JavaDoc of >>>>>> IPageStore!) >>>>>> - I feel that all pages could be in a single HashMap (rather than >>>>>> using 3 levels of nested HashMaps and HashSets). I therefore >>>>>> implemented my own PageStore based on your ideas to confirm my >>>>>> feelings (using a single HashMap per sesison, using less >>>>>> Hash(Map|Set) iterations; access synchronized using a >>>>>> ReentrantReadWriteLock which I think has quite good performance with >>>>>> TC). Please have a look. We can probably >>>>>> http://www.nabble.com/file/p18312624/MyTerracottaPageStore.java >>>>>> MyTerracottaPageStore.java merge our ideas for best results! >>>>>> >>>>>> Regards >>>>>> Stefan >>>>>> >>>>>> >>>>>> http://www.nabble.com/file/p18312624/MyTerracottaPageStore.java >>>>>> MyTerracottaPageStore.java >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> >>> >>> >> >> > > ----- ------- Stefan Fußenegger http://talk-on-tech.blogspot.com // looking for a nicer domain ;) -- View this message in context: http://www.nabble.com/Terracotta-integration-tp18168616p18321680.html Sent from the Wicket - Dev mailing list archive at Nabble.com.
