Hi Paulo, Thanks for the comments. I am always looking for ways to improve my code.
With regard to the parser overall, initial testing on my system processed a 9.5 MB RTF file in ~29 seconds. Through optimizations, it now takes only ~13 seconds.

The HashMap is initialized with a capacity large enough to hold all the keys with 10% free slots. Sizing it up front ensures it doesn't have to rebuild itself while it is being loaded, so no memory reallocation occurs. The HashMap object consumes ~8k of memory.

Here are the statistics for loading the RtfCtrlWordMgr object, which includes initializing and loading the HashMap. I ran it with a non-static and a static HashMap object. All times end up approximately the same on my system.

=Non Static HashMap========================
RtfCtrlWordMgr start date: Dec 5, 2007 9:31:09 AM
RtfCtrlWordMgr end date : Dec 5, 2007 9:31:09 AM
Elapsed time : 141 milliseconds.
Begin Constructor RtfCtrlWordMgr , free mem is 1,321k
End Constructor RtfCtrlWordMgr , free mem is 1,166k
RtfCtrlWordMgr used approximately 155k
========================================

=Static HashMap===========================
RtfCtrlWordMgr start date: Dec 5, 2007 9:32:12 AM
RtfCtrlWordMgr end date : Dec 5, 2007 9:32:12 AM
Elapsed time : 157 milliseconds.
Begin Constructor RtfCtrlWordMgr , free mem is 1,324k
End Constructor RtfCtrlWordMgr , free mem is 1,169k
RtfCtrlWordMgr used approximately 155k
=======================================

Ultimately, there may be a way to perform lazy loading of the classes. I'll keep that in mind as I work through them. 99% of the classes are not implemented yet, so for now the functionality is limited to duplicating the old process.

Additionally, there are not thousands of elements. If I remember correctly, there are 1810 control words. Each control word performs its own special function, and some control words perform multiple functions depending on the state of the document. Each extended control word class will eventually end up doing its own processing.
So I'm not quite sure they can be collapsed into a generic class. I will, however, continue to look for ways to improve the code!

Howard

----- Original Message ----
From: Paulo Soares <[EMAIL PROTECTED]>
To: Post all your questions about iText here <[EMAIL PROTECTED]>
Sent: Wednesday, December 5, 2007 5:56:38 AM
Subject: Re: [iText-questions] RTF Parser update

I had a very quick look at the thousands of new classes created and I have a few remarks:

- it looks like 99% of those classes could be eliminated and a generic class with some parameters be created when loading the hash
- each time a parser is created the hash must be filled with thousands of elements with only a few being actually used. If it's not possible to have the hash as a static object at least it could be filled dynamically as needed, saving time and memory

Paulo

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On
> Behalf Of Howard Shank
> Sent: Tuesday, December 04, 2007 9:48 PM
> To: Post all your questions about iText here
> Subject: [iText-questions] RTF Parser update
>
> Hello everyone,
>
> Just a quick note to let everyone know there was a pretty big
> update to the RtfParser today. Mark Hall was gracious enough
> to review and accept the changes and added the update to the
> repository today.
>
> New parser features:
> Correctly parses all control words, parameters and data.
> Uses BufferedReader for faster processing of input.
> New control word "wiring" architecture allows for easier
> implementation of control words.
> Control Words defined in this update are from the RTF
> Specification 1.9. (Does not include some application
> specific extensions)
>
> This update includes a rewrite of the parser and lots of new
> "wiring" for handling the 1800+ RTF control words. The source
> update size is approximately 12.5 MB.
>
> The import functionality should work exactly as before and
> does not require any changes to existing code using the
> RtfWriter2.
> If you encounter any issues, please post a
> description of the issue here, with a sample RTF file if
> possible, and I will follow up as quickly as I can.
>
> Further enhancements I am working on include:
> Handling info group data, i.e. author, subject, title, etc.
> Handling stylesheet mapping.
> Handling list table mapping.
> and more...
>
> Regards,
> Howard Shank
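The up-front HashMap sizing described in Howard's reply can be sketched roughly as follows. This is only an illustration, not the RtfCtrlWordMgr code: the class name PresizedMapSketch, the capacityFor helper and the placeholder keys are hypothetical, and the sketch assumes java.util.HashMap with its default load factor of 0.75. A HashMap grows once its entry count exceeds capacity times load factor, so the requested capacity needs to be at least the expected entry count divided by the load factor. Whether the 10% headroom mentioned in the reply is sufficient depends on the load factor actually used, which the message does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: presize a HashMap so that loading a known number
// of keys never triggers an internal resize (rehash).
class PresizedMapSketch {

    /** Smallest initial capacity that holds expectedEntries without a resize. */
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    /** Builds a map with 'count' placeholder control-word keys. */
    static Map<String, Integer> buildControlWordMap(int count) {
        float loadFactor = 0.75f; // HashMap's documented default
        Map<String, Integer> map =
                new HashMap<>(capacityFor(count, loadFactor), loadFactor);
        for (int i = 0; i < count; i++) {
            map.put("ctrl" + i, i); // placeholder keys, not real RTF control words
        }
        return map;
    }

    public static void main(String[] args) {
        // 1810 is the control word count quoted in the thread.
        Map<String, Integer> map = buildControlWordMap(1810);
        System.out.println("entries: " + map.size());
        System.out.println("requested capacity: " + capacityFor(1810, 0.75f));
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the actual table may be larger than the value passed in.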

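Paulo's suggestion of a shared hash that is filled dynamically as needed could look something like the sketch below. All names here (LazyControlWordRegistry, GenericHandler, handlerFor, loadedCount) are hypothetical and are not part of the iText RTF parser. The generic handler is only a stand-in: as Howard points out, many control words need their own processing, so a real registry would presumably map each name to a specific handler factory rather than to one generic class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a static registry that starts empty and creates
// a handler only the first time a control word is looked up.
class LazyControlWordRegistry {

    /** Stand-in for a generic, parameterized control word handler. */
    static final class GenericHandler {
        final String controlWord;

        GenericHandler(String controlWord) {
            this.controlWord = controlWord;
        }
    }

    // Shared by all parser instances, so it is not refilled per parser.
    private static final Map<String, GenericHandler> HANDLERS =
            new ConcurrentHashMap<>();

    /** Returns the cached handler, creating it on first use. */
    static GenericHandler handlerFor(String controlWord) {
        return HANDLERS.computeIfAbsent(controlWord, GenericHandler::new);
    }

    /** Number of handlers created so far. */
    static int loadedCount() {
        return HANDLERS.size();
    }

    public static void main(String[] args) {
        handlerFor("par");
        handlerFor("b");
        handlerFor("par"); // second lookup reuses the cached handler
        System.out.println("handlers loaded: " + loadedCount());
    }
}
```

With this shape, a document that uses only a handful of control words pays only for those entries, at the cost of a map lookup with possible creation on first use.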