dfs 01/05/19 23:04:59
Modified: . TODO
Log:
Added reminder to reevaluate the performance of input iteration via a
virtual method versus direct character array indexing, and to consider
reintroducing stream matching if that performance is acceptable under HotSpot.
Revision Changes Path
1.4 +23 -1 jakarta-oro/TODO
Index: TODO
===================================================================
RCS file: /home/cvs/jakarta-oro/TODO,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- TODO 2001/05/18 09:53:57 1.3
+++ TODO 2001/05/20 06:04:59 1.4
@@ -1,4 +1,4 @@
-$Id: TODO,v 1.3 2001/05/18 09:53:57 dfs Exp $
+$Id: TODO,v 1.4 2001/05/20 06:04:59 dfs Exp $
o Optimize/improve Unicode character classes.
@@ -22,3 +22,25 @@
o Make build.xml build the demonstration applet and integrate with
docs/ tree.
+
+o Measure performance of HotSpot iterating through match input via
+ an interface's virtual function versus direct character array indexing.
+ If HotSpot dynamically inlines the functions and achieves comparable
+ performance, provided a clear warning is indicated that performance
+ will be reduced on earlier JDK versions, create a generic interface
+ for representing input. CharStringPointer should be replaced with
+ a generic interface, PatternMatcherInput should be made to implement
+ the interface, and stream matching can be reintroduced.
+ Reintroduced stream matching should include a callback mechanism in the
+ interface to report when a "contains" match has been found to allow
+ the input encapsulator to trim its buffer. Strong warnings must go
+ into the documentation referencing the ACM paper and noting that for
+ many streams it will be more efficient to read the entire stream into
+ a buffer first rather than try to match incrementally because many
+ regular expressions will cause the whole stream to be read in anyway.
+ For situations where that is not the case we want to be able to trim
+ the buffer (there have been people who used OROMatcher to search
+ gigabyte length files!). Additional methods should be added to
+ regulate buffer growth behavior, whether to save all of it for reuse
+ in a future pass, etc.
+
\ No newline at end of file
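
The TODO item above can be illustrated with a rough sketch. This is only an
assumption about what the generic interface might look like, not code from
jakarta-oro: the names MatchInput, CharArrayInput, matchFound,
discardableUpTo and InputIterationSketch are hypothetical. It shows the two
access styles the measurement would compare (an interface call per character
versus direct array indexing) and a callback that lets the input
encapsulator learn how much of its buffer it may trim. The timing in main is
a crude single run with no warm-up, so it should not be taken as a real
benchmark.

```java
/** Hypothetical generic input abstraction; not part of jakarta-oro. */
interface MatchInput {
    /** Returns the character at the given offset (the virtual call to be measured). */
    char charAt(int index);

    /** Returns the number of characters currently available. */
    int length();

    /** Callback: a "contains" match ended at endOffset, so earlier input may be dropped. */
    void matchFound(int endOffset);
}

/** Simplest possible implementation: wraps a fully buffered character array. */
final class CharArrayInput implements MatchInput {
    private final char[] buffer;
    private int discardableUpTo = 0;

    CharArrayInput(char[] buffer) {
        this.buffer = buffer;
    }

    public char charAt(int index) {
        return buffer[index];
    }

    public int length() {
        return buffer.length;
    }

    public void matchFound(int endOffset) {
        // A streaming implementation could trim its buffer here;
        // this sketch only records the offset.
        discardableUpTo = endOffset;
    }

    int discardableUpTo() {
        return discardableUpTo;
    }
}

final class InputIterationSketch {
    /** Counts occurrences of target using direct array indexing. */
    static int countDirect(char[] input, char target) {
        int count = 0;
        for (int i = 0; i < input.length; i++) {
            if (input[i] == target) {
                count++;
            }
        }
        return count;
    }

    /** Counts occurrences of target through the interface method. */
    static int countViaInterface(MatchInput input, char target) {
        int count = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        char[] data = new char[1 << 20];
        java.util.Arrays.fill(data, 'x');
        data[data.length - 1] = 'a';
        MatchInput wrapped = new CharArrayInput(data);

        long t0 = System.nanoTime();
        int direct = countDirect(data, 'a');
        long t1 = System.nanoTime();
        int viaInterface = countViaInterface(wrapped, 'a');
        long t2 = System.nanoTime();

        System.out.println("direct count=" + direct + " ns=" + (t1 - t0));
        System.out.println("interface count=" + viaInterface + " ns=" + (t2 - t1));
    }
}
```

A meaningful comparison would need repeated runs after JIT warm-up, on both
HotSpot and an earlier JDK, since the TODO item is specifically about
whether HotSpot inlines the interface call.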