The most obvious answer is ... not to store the data in memory, and to use the DB engine instead: you should limit your result sets and use offsets into your results.
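Here is a rough sketch of what I mean by limits and offsets, in case it helps. The names PagedReader, PageFetcher and titleFetcher, the table cd_pubs and its columns are all made up for illustration. The LIMIT/OFFSET clause is the PostgreSQL form; your source database will probably need a different paging clause, so treat the SQL as an assumption to adapt.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class PagedReader {

    /** Hypothetical callback: returns up to `limit` rows starting at row `offset`. */
    interface PageFetcher<T> {
        List<T> fetch(int limit, int offset);
    }

    /**
     * Walks the whole result one page at a time, so only one page of rows
     * is held in memory at once. Returns the number of rows handled.
     */
    static <T> int forEachPage(PageFetcher<T> fetcher, int pageSize, Consumer<T> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int offset = 0;
        int total = 0;
        while (true) {
            List<T> page = fetcher.fetch(pageSize, offset);
            for (T row : page) {
                handler.accept(row);
            }
            total += page.size();
            if (page.size() < pageSize) {
                return total; // a short or empty page means there is nothing left
            }
            offset += pageSize;
        }
    }

    /**
     * Example JDBC fetcher. The table and column names are assumptions, and
     * LIMIT/OFFSET is PostgreSQL-style syntax. ORDER BY keeps the pages stable.
     */
    static PageFetcher<String> titleFetcher(Connection con) {
        return (limit, offset) -> {
            String sql = "SELECT title FROM cd_pubs ORDER BY id LIMIT ? OFFSET ?";
            List<String> rows = new ArrayList<>();
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setInt(1, limit);
                ps.setInt(2, offset);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows.add(rs.getString(1));
                    }
                }
            } catch (SQLException e) {
                throw new IllegalStateException("page query failed", e);
            }
            return rows;
        };
    }
}
```

With a page size of a few hundred, the loop in forEachPage never touches JDBC itself, so the fetcher is the only piece that has to know about a particular database.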
So your app architecture should be improved to handle only a few hundred or a thousand records at a time, and then dip back in and get more as you need them. Tell me more about what you are trying to do. If it is like Google, you can just retrieve a range (say, from 3000 to 5000) and process that subset. For example, look at the variations of the offset syntax described here:
http://www.postgresql.org/idocs/index.php?sql-select.html

Consider using the Disconnected Row Set (on SourceForge, in basicPortal and jxUtil) instead of CachedRowSet. Copying data is time-consuming, so do not be afraid to lean on the DB engine: they are quite fast, and they cache themselves.

hth,
v

--- In [EMAIL PROTECTED], "Jeff Duska" <mvc-programmers-[EMAIL PROTECTED]> wrote:
> I'm working with very large data sets. In my current problem, I have
> about 74,000 records that need to be converted, filtered and stored in
> Oracle database. The initial run will has the largest query results.
> After this, the data will change only on a very limited basis.
>
> The source database is Microsoft Server SQL 7 and the destination
> database is Oracle database running on a Solaris. I've been creating a
> modest program in Java to do the conversion.
>
> The current design uses a Row Gateway pattern from
> http://martinfowler.com/isa/index.html. This is basically a static class
> that creates a collection of value objects. I've been surprised at how
> fast this working. I have the following code in my main routine
>
> start = System.currentTimeMillis();
> CDPubsList pubs = new CDPubsList();
> List pubsList = pubs.getCDPubList();
> for (Iterator iter = pubsList.iterator(); iter.hasNext();) {
>     CDPub element = (CDPub) iter.next();
>     System.out.println(element);
> }
> finish = System.currentTimeMillis();
>
> It typically takes about only thirty seconds. I think some of the time
> is cause by the System.out.println and my log4J debug statements. My
> problem is that I ran out of memory in Eclipse. I fixed the problem by
> upping memory in Eclipse.
> The issue is that I know that this is one of
> my smaller data sets.
>
> I was going to replace this with a CachedRowSet, but I just read in the
> JDBC API Tutorial and Reference, 2nd ed that
>
> "CachedRowSet - ... [is] not suitable for very large data sets..."
>
> I could switch my design to have the main loop read one record at a
> time. I have several reservations about this design. First, it is that
> not very object-oriented. The core code knows all about my database.
> Since I have to create several of this conversation programs over the
> next year, I wanted the core code to be like a simple framework or
> harness that I could reuse over and over. I lose the reusable components
> that I'm creating. I would either have to have the connection open all
> the time or I would need to add a connection pool manger - such as DBCP
> from Apache.
>
> I could grab portions of the data set, I think this would be called
> paging, but I just can't see how to do this without requiring an
> alteration to the source database schema. I could do this via view, but
> I rather not having to do this.
>
> I figure I must be just missing something obvious. If you don't mind, I
> was hoping that someone have some pointers...
>
>
> _______________________________________________
> MVC-Programmers mailing list
> [EMAIL PROTECTED]
> http://www.netbean.net/mailman/listinfo/mvc-programmers