Hello Pawan, Approach looks good, my only suggestion is to use batch processing only when we are dealing with large data set, as this method takes a longer time compared to the normal method specially on a distributed environment, which may negatively impact the performance.
Kind Regards, Chandan Khandelwal Senior Manager, Enterprise Software Development *HotWax Systems* *Enterprise open source experts* cell: +91-98934-81076 office: 0731-409-3684 http://www.hotwaxsystems.com On Fri, Jun 5, 2020 at 4:07 PM Pawan Verma <pawan.ve...@hotwaxsystems.com> wrote: > Thanks, Pritam and Scott for the discussion. > > I've created Jira OFBIZ-11789 for this improvement and also created a PR > with the proposed changes. > > I request everyone to review the PR and suggest your thought on this. > Thanks! > -- > Thanks & Regards > Pawan Verma > Technical Consultant > *HotWax Systems* > *Enterprise open source experts* > http://www.hotwaxsystems.com > > > On Mon, Jun 1, 2020 at 12:36 PM Pritam Kute <pritam.k...@hotwaxsystems.com > > > wrote: > > > Thanks Scott for your detailed explanation. > > > > The solution looks good to me too. My confusion was with why we are going > > to implement new method if we can achieve that using the current > > EntityQuery methods. > > > > +1 for adding queryBatchIterator() to EntityQuery. > > > > Kind Regards, > > -- > > Pritam Kute > > > > > > On Thu, May 28, 2020 at 6:32 AM Scott Gray <scott.g...@hotwaxsystems.com > > > > wrote: > > > > > Hi Pritam, > > > > > > I'm not sure about PostgreSQL or Derby but I know with MySQL that > using a > > > cursor doesn't really work. You have to set the result set to > > > TYPE_FORWARD_ONLY and CONCUR_READ_ONLY and also set the fetch size to > > > INTEGER.MIN_VALUE. Only then will the driver stream the results and > even > > > then, you may not execute any other SQL commands on the connection > until > > > you have fully read or closed the resultset. > > > > > > So if an EntityListIterator doesn't really conserve memory, then you > need > > > to take a paging query approach such as this: > > > EntityQuery query = > > EntityQuery.use(delegator).from("SomeTable").limit(100) > > > List results = null > > > while (!(results = query.queryList()).isEmpty()) { > > > for (value : results) { > > > // do something with each value > > > } > > > query.offset(query.getOffset() + query.getLimit()) > > > } > > > > > > Or with the proposed EntityBatchIterator: > > > Iterator query = > > > > > > > > > EntityQuery.use(delegator).from("SomeTable").limit(100).queryBatchIterator() > > > while (iterator.hasNext()) { > > > result = iterator.next() > > > // do something with each value > > > } > > > > > > I guess an alternative approach would be to implement something similar > > > within the EntityListIterator and perhaps a flag to turn it off or on > > > depending on which database is being used and how well it supports > > > iterating over results without loading the entire resultset into > memory. > > > > > > Regards > > > Scott > > > > > > > > > > > > On Sat, 23 May 2020 at 20:59, Pritam Kute < > pritam.k...@hotwaxsystems.com > > > > > > wrote: > > > > > > > Hello Pawan, > > > > > > > > I just had a look into the EntityQuery.queryIterator() method and > looks > > > > like we can achieve that by using fetchSize(), fowardOnly(), > > > > cursorScrollInsensitive(), cursorScrollSensitive() and offset() > methods > > > in > > > > EntityQuery class. Let me know if I am missing anything. > > > > > > > > It will be good if you can post a pseudo code or something here so > that > > > we > > > > could get an understanding of the exact design which you have in your > > > mind. > > > > > > > > Kind Regards, > > > > -- > > > > Pritam Kute > > > > > > > > > > > > On Thu, May 21, 2020 at 7:41 PM Pawan Verma < > > > pawan.ve...@hotwaxsystems.com > > > > > > > > > wrote: > > > > > > > > > Hello Devs, > > > > > > > > > > While working on the large database we have figured out that very > > large > > > > > queries consume all memory and crash ofbiz(because queryIterator() > > > > doesn't > > > > > really work, it's no different from queryList()) > > > > > > > > > > The EntityListIterator attempts to use a cursor to iterate over > large > > > > > result sets but in reality most databases do not give us a cursor > > > unless > > > > we > > > > > ask for it in a very specific way, and instead you get back the > full > > > > result > > > > > set and potentially consume a large amount of memory. For example, > > the > > > > > MySql details are here (ResultSet section): > > > > > > > > > > > > > > > > > > > > https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html > > > > > > > > > > To work around this we've built the EntityBatchIterator which > instead > > > > > basically just pages through results like you would do on a webpage > > > that > > > > > uses pagination. The iterator takes an EntityQuery and when next() > > is > > > > > called it grabs the first 500 records and returns the first record, > > > call > > > > > next again and you get the second record, once you've called next > 500 > > > > times > > > > > it runs the query again with an offset and grabs the next 500 and > so > > on > > > > > until there are no more records. > > > > > > > > > > The main downsides to this approach are: > > > > > 1. Same as when using limit/offset you want to be sure your results > > > will > > > > > come back in a consistent order to avoid accidentally skipping some > > > rows > > > > > and seeing other rows twice. > > > > > 2. Because the results are a combination of many queries rather > than > > a > > > > > single query, some of the data may change while you are paging > > through > > > > it. > > > > > i.e. if you were to sort by lastUpdatedTxStamp you may see some > rows > > > > twice > > > > > as they are updated by other transactions (this might be a good > thing > > > in > > > > > some cases). > > > > > > > > > > So, the main proposal is to introduce a new EntityBatchIterator for > > > large > > > > > queries. If we agree on the proposal I'll create a Jira and > provide > > a > > > > > patch for the community review. Thanks! > > > > > > > > > > -- > > > > > Thanks & Regards > > > > > Pawan Verma > > > > > Technical Consultant > > > > > *HotWax Systems* > > > > > *Enterprise open source experts* > > > > > http://www.hotwaxsystems.com > > > > > > > > > > > > > > >