RE: WildcardMatcherHelper caching issue

2007-01-08 Thread Bruno Dumon
Ard,

What is cached is the pattern, not the string to be matched against it,
so what you describe isn't a problem IIUC.

On Mon, 2007-01-08 at 10:30 +0100, Ard Schrijvers wrote:
 Hello,
 
 I think I missed this WildcardMatcherHelper until now. From which Cocoon
 version on is it available? Can you configure in your matcher whether it
 should use this WildcardMatcherHelper, or is it used by default?
 
 Regarding the caching, it currently looks to me like a likely memory leak.
 What if I have something like
 
 <map:part element="othermatcher"
 value="cocoon://foo/{date:MMddHHmmssSS}/"/>
 
 or if you have an active forum built with CForms, and patterns like
 2ervw3verv452345435wdfwfw.continue are cached (or is it only for
 caching pipelines?)
 
 This would imply a new cached pattern for every request. Of course, the
 example above with the date is silly, but it is too easy for a user to create
 memory leaks. Having the user choose between a caching and a non-caching
 WildcardMatcherHelper seems to me too difficult a judgement for an average
 user to make. The proposed WeakHashMap should be some sort of SoftHashMap
 (SoftReference) instead. Weak references are cleared once no strong reference
 remains, so either there is a strong reference (implying the same memory
 leak) or there is none, and all cached patterns are removed on every GC. Soft
 references are only cleared when the JVM decides to do so (when it is low on
 memory). But, IMO, it is not OK to let the JVM possibly run low on memory and
 then remove cached patterns at random (it makes more sense to keep the most
 used patterns in memory).
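
The soft-reference idea discussed above could look roughly like the following in modern Java. This is only a sketch under stated assumptions: SoftValueCache, get and entryCount are hypothetical names, not part of Cocoon, and the loader must not return null.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper (not a Cocoon class): values are held through
// SoftReferences, so the JVM may clear them when it runs low on memory.
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    // Returns the cached value, or recomputes it if it was never stored
    // or has been cleared by the garbage collector.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Number of keys currently tracked, including keys whose value
    // has already been cleared.
    int entryCount() {
        return map.size();
    }
}
```

Note that this sketch never removes the keys of cleared entries, so with always-changing patterns the key set still grows, and which values get cleared is up to the JVM rather than based on how often a pattern is used.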
 
 I really think the best way is a simple LRUMemoryStore with a maxitems
 setting of 1000 or so by default, which a user who knows more about it can
 override. By default, every user can then work with it without having to
 think about it.
 
 Regards Ard
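
A bounded LRU cache along the lines proposed above can be sketched in modern Java with java.util.LinkedHashMap in access order. The class name LruPatternCache and the maxItems parameter are assumptions made for illustration; this is not the LRUMemoryStore mentioned in the message, nor any existing Cocoon class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry
// once more than maxItems entries are stored.
final class LruPatternCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxItems;

    LruPatternCache(int maxItems) {
        // accessOrder = true: iteration order runs from least recently
        // accessed to most recently accessed entry.
        super(16, 0.75f, true);
        this.maxItems = maxItems;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxItems;
    }
}
```

With a limit of 1000, always-changing patterns would push out old entries instead of growing the map without bound, while frequently used patterns stay cached. LinkedHashMap is not thread-safe, so a shared static cache would need external synchronization, for example via Collections.synchronizedMap.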

-- 
Bruno Dumon http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]  [EMAIL PROTECTED]



RE: WildcardMatcherHelper caching issue

2007-01-08 Thread Ard Schrijvers

 Ard,
 
 What is cached is the pattern, not the string to be matched 
 against it,
 so what you describe isn't a problem IIUC.

I was already a little surprised. But then, who is using always-changing
patterns? In dynamic sitemaps or something? I do not really see the possible
memory leak here.

Ard

 


RE: WildcardMatcherHelper caching issue

2007-01-08 Thread Bruno Dumon
On Mon, 2007-01-08 at 11:52 +0100, Ard Schrijvers wrote:
  Ard,
  
  What is cached is the pattern, not the string to be matched 
  against it,
  so what you describe isn't a problem IIUC.
 
 I was already a little surprised. But then, who is using always-changing
 patterns? In dynamic sitemaps or something? I do not really see the
 possible memory leak here.
 

There is indeed no real issue here for Cocoon. I was just nitpicking:
I preferred a design whereby the user of the wildcard matcher is
responsible for the caching, just as is the case for regexps, XSLTs
or whatever.
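
As an illustration of caller-side caching as it works for regexps, here is a minimal sketch using java.util.regex. The class ContinuationMatcher and its method idOf are hypothetical names; the point is only that the caller compiles the pattern once and keeps it in an instance variable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical caller that owns its compiled pattern. No shared cache is
// involved: the pattern lives exactly as long as this object does.
final class ContinuationMatcher {
    private final Pattern pattern;

    ContinuationMatcher(String regex) {
        // Compiled once, at construction time.
        this.pattern = Pattern.compile(regex);
    }

    // Returns the first capture group if the whole URI matches, else null.
    String idOf(String uri) {
        Matcher m = pattern.matcher(uri);
        return m.matches() ? m.group(1) : null;
    }
}
```

A one-shot pattern in this design simply becomes garbage once its owner is no longer referenced, and no hashmap lookup is needed per match.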

-- 
Bruno Dumon http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]  [EMAIL PROTECTED]



Re: WildcardMatcherHelper caching issue

2007-01-07 Thread Bruno Dumon
On Sat, 2007-01-06 at 01:03 +0100, Alfred Nathaniel wrote:
 On Fri, 2007-01-05 at 13:45 +0100, Bruno Dumon wrote:
  Hi,
  
  I noticed the new WildcardMatcherHelper class holds an internal static
  map for caching. In the older solution, it was up to the caller to cache
  the compiled pattern (similar to how regexp libraries work). This had
  the advantage that the caller itself can decide whether the pattern
  should be cached. It also avoids a potential memory leak if this code is
  used to evaluate always-changing patterns, and avoids the need to do
  hashmap lookups.
  
  So I'm wondering if anyone would mind if I change it back so that the
  caller caches the pattern?
  
  Thanks for any input.
 
 The integrated cache is a convenience for the many clients who repeatedly
 match the same pattern and gain performance without having to code their
 own cache management.
 
 If you have an application where you will be matching a lot of one-shot
 patterns, you could add
 
    public static Map match(String pat, String str, Map cache)
 
 which can be called with a null Map to by-pass caching.  The old
 signature then becomes simply
 
    public static Map match(String pat, String str) {
        return match(pat, str, cache);
    }
 
 The built-in cache could also use a WeakHashMap to avoid ever-increasing
 memory consumption.

Thanks for the info.

I don't actually have an immediate use for one-shot patterns, however
I'm using the wildcard matcher and noticed the change. I thought the
compiled patterns were usually just kept in instance variables, hardly
deserving to be called cache management, though I must admit I didn't
study how it is done everywhere. Having this cache inside the
WildcardMatcherHelper seemed like a step back to me. But if needed,
non-caching behaviour can indeed be added again later while keeping the
convenience of the default cache.

-- 
Bruno Dumon http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]  [EMAIL PROTECTED]



Re: WildcardMatcherHelper caching issue

2007-01-05 Thread Alfred Nathaniel
On Fri, 2007-01-05 at 13:45 +0100, Bruno Dumon wrote:
 Hi,
 
 I noticed the new WildcardMatcherHelper class holds an internal static
 map for caching. In the older solution, it was up to the caller to cache
 the compiled pattern (similar to how regexp libraries work). This had
 the advantage that the caller itself can decide whether the pattern
 should be cached. It also avoids a potential memory leak if this code is
 used to evaluate always-changing patterns, and avoids the need to do
 hashmap lookups.
 
 So I'm wondering if anyone would mind if I change it back so that the
 caller caches the pattern?
 
 Thanks for any input.

The integrated cache is a convenience for the many clients who repeatedly
match the same pattern and gain performance without having to code their
own cache management.

If you have an application where you will be matching a lot of one-shot
patterns, you could add

   public static Map match(String pat, String str, Map cache)

which can be called with a null Map to by-pass caching.  The old
signature then becomes simply

   public static Map match(String pat, String str) {
       return match(pat, str, cache);
   }

The built-in cache could also use a WeakHashMap to avoid ever-increasing
memory consumption.
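
To make the proposed overload concrete, here is a rough, self-contained sketch. It is not the real WildcardMatcherHelper: the class name WildcardSketch, the regex-based compile step, the treatment of '*' as "any run of characters", and the result keys "0", "1", ... are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a wildcard matcher with an optional cache.
final class WildcardSketch {
    // Built-in cache used by the two-argument convenience method.
    private static final Map<String, Pattern> DEFAULT_CACHE = new ConcurrentHashMap<>();

    // Old signature: delegates to the new one with the built-in cache.
    static Map<String, String> match(String pat, String str) {
        return match(pat, str, DEFAULT_CACHE);
    }

    // New signature: a null cache bypasses caching entirely.
    static Map<String, String> match(String pat, String str, Map<String, Pattern> cache) {
        Pattern compiled = (cache == null)
                ? compile(pat)
                : cache.computeIfAbsent(pat, WildcardSketch::compile);
        Matcher m = compiled.matcher(str);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i <= m.groupCount(); i++) {
            result.put(String.valueOf(i), m.group(i));
        }
        return result;
    }

    // Simplified compile step: each '*' becomes a capturing "(.*)" group,
    // everything else is matched literally.
    static Pattern compile(String pat) {
        String[] literals = pat.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append("(.*)");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }
}
```

A caller with one-shot patterns would pass null, while a caller who wants to control the cache lifetime could pass its own map, for instance a bounded one.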

Cheers, Alfred.