Okay, I did some work on this, purely as a learning experience. I wanted
to learn SAX parsing, so I tried converting the screen widgets to it.

I found a small public-domain framework that makes the whole process very easy. 
Since all of the model screen widgets subclass a single base class, I was able 
to hook them into the parsing framework by just having the base class subclass 
one of the framework classes. Model widgets that don't have sub-widgets just 
needed a new constructor. Model widgets that have sub-widgets needed a little 
extra code to handle the sub-widgets, but it was no more code than what already 
exists to handle the DOM version of the sub-widgets.
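
To give a rough idea of the shape of it, here is a minimal sketch using only the SAX classes that ship with the JDK. It is not the POC code and it does not use the framework I mentioned; the Widget and WidgetHandler classes, the "name" attribute, and the sample XML are all made up for illustration. Each widget is built straight from the SAX attributes in its constructor, and only widgets with sub-widgets need the extra addChild step.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxWidgetSketch {

    /** Hypothetical stand-in for a model widget base class (not an OFBiz class). */
    static class Widget {
        final String tagName;
        final String name;
        final List<Widget> children = new ArrayList<>();

        // Built directly from the SAX attributes; no DOM Element involved.
        // The value is copied right away because the parser may reuse the Attributes object.
        Widget(String tagName, Attributes attrs) {
            this.tagName = tagName;
            this.name = attrs.getValue("name");
        }

        // The "little extra code" that only widgets with sub-widgets need.
        void addChild(Widget child) {
            children.add(child);
        }
    }

    /** Turns SAX events into a widget tree, using a stack to track the current parent. */
    static class WidgetHandler extends DefaultHandler {
        private final Deque<Widget> stack = new ArrayDeque<>();
        Widget root;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            Widget widget = new Widget(qName, attrs);
            if (stack.isEmpty()) {
                root = widget;                  // first element is the root widget
            } else {
                stack.peek().addChild(widget);  // attach to the enclosing widget
            }
            stack.push(widget);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            stack.pop();                        // this widget is complete
        }
    }

    static Widget parse(String xml) throws Exception {
        WidgetHandler handler = new WidgetHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.root;
    }

    public static void main(String[] args) throws Exception {
        Widget screen = parse("<screen name=\"main\"><label name=\"title\"/>"
                + "<container name=\"body\"><label name=\"text\"/></container></screen>");
        // prints: screen has 2 children
        System.out.println(screen.tagName + " has " + screen.children.size() + " children");
    }
}
```

In this sketch the handler owns the parent/child bookkeeping; in the approach described above, the widgets themselves take part in the parsing by way of the framework base class.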

Overall, it was pretty easy, and I was surprised when it worked the very first
time I tried it.

If anyone is interested, I would be happy to post the POC code in Jira. Just 
let me know.

-Adrian



--- On Sat, 4/25/09, Adrian Crum <adrian.c...@yahoo.com> wrote:

> From: Adrian Crum <adrian.c...@yahoo.com>
> Subject: Re: Discussion: XML file parsing improvement
> To: dev@ofbiz.apache.org
> Date: Saturday, April 25, 2009, 10:28 PM
> Adam and David,
> 
> Thank you for your comments! I'll look into the entity
> import code some more.
> 
> Personally, I don't have an issue with importing large
> XML files. I see it come up from time to time on the mailing
> lists. I remember BJ Freeman had to write his own import
> code because of some OFBiz limitation.
> 
> I'll accept that the widget files, scripts, and config files
> are too small to optimize. Having event-driven parsing for
> those might be an interesting experiment though.
> 
> -Adrian
> 
> 
> --- On Sat, 4/25/09, Adam Heath <doo...@brainfood.com> wrote:
> 
> > From: Adam Heath <doo...@brainfood.com>
> > Subject: Re: Discussion: XML file parsing improvement
> > To: dev@ofbiz.apache.org
> > Date: Saturday, April 25, 2009, 9:52 PM
> > Adrian Crum wrote:
> > > OFBiz uses a lot of XML files. When each XML file is
> > > read, it is first parsed into a DOM Document, then the
> > > DOM Document is parsed into OFBiz Java objects. This
> > > two-step process consumes a lot of memory, and it
> > > takes more time than it should.
> > > 
> > > There is an alternative - what is called event-driven
> > > parsing. The XML parser can be set up to convert XML
> > > elements directly to the OFBiz Java objects -
> > > bypassing the DOM Document build and parse steps.
> > > Theoretically, this could provide a huge performance
> > > boost, and it would use less memory. In addition, it
> > > would solve the problem of huge XML files maxing out
> > > server memory during the parse process - like with
> > > entity XML import/export.
> > > 
> > > Has anyone else considered this? Do you think it is
> > > worth pursuing?
> > 
> > What files are you talking about, that are so huge, they
> > can't be parsed with the simpler DOM model?
> > 
> > entity data files are sax based already.
> > 
> > widget files, scripts, config files are small, so it's
> > better to keep the simpler algo, as David suggested.
> > 
> > additionally, I already did some memory profiling a while
> > back, and interned the long-lived strings from parsed xml.
> > This actually reduced memory usage.
> > 
> > Another thing, the widgets, scripts, config files are read
> > very infrequently, then cached.  The time it takes to parse
> > them is not really a performance consideration.
> > 
> > As an aside, how much swap do you have on your server?
> > Any?  Is it being used?  Then you don't have enough ram.
> > If your work-load is causing swap to be used, then you
> > haven't correctly identified your work load usage
> > requirements.
> > 
> > The same can be said for java maximum memory allocation.


      
