[ 
https://issues.apache.org/jira/browse/OLINGO-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742469#comment-17742469
 ] 

Daniel Fernández commented on OLINGO-1504:
------------------------------------------

I can confirm there is a series of inefficiencies in the way buffers are used 
for consuming OData response payloads in several Olingo classes (not only the 
one patched by the author of this ticket). This is causing very high memory 
usage in my team's applications when accessing OData services that have large 
EDMs (metadata documents), and also when consuming data responses (e.g. 
{{AbstractODataResponse.java}} is affected too).

The reason for these inefficiencies is a mix of initial default buffer sizes 
being too small and unneeded {{byte[]}} buffer copy operations like these lines 
in {{AbstractODataResponse.java}}:
{code:java}
      ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
      try {
        // INEFFICIENT: Why copy here before checking whether inputContent is null?
        org.apache.commons.io.IOUtils.copy(payload, byteArrayOutputStream);
        if(inputContent == null){
          inputContent  = byteArrayOutputStream.toByteArray();
        }
        inputStream = new ByteArrayInputStream(inputContent);
        return inputStream;
      } catch (IOException e) {
        ...
      }
{code}
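
As a rough illustration of the direction I have in mind (not a patch against Olingo), the copy could be moved inside the null check and the buffer pre-sized. The class name {{CachedPayloadSketch}}, the {{sizeHint}} parameter (for example taken from a Content-Length header) and the 8192-byte chunk size are all assumptions made for this sketch; only {{payload}} and {{inputContent}} mirror names from the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the caching logic; this is NOT Olingo code.
final class CachedPayloadSketch {

  private final InputStream payload; // raw response stream, assumed non-null
  private final int sizeHint;        // assumed hint (e.g. Content-Length), <= 0 if unknown
  private byte[] inputContent;       // cached body, filled on first access only

  CachedPayloadSketch(InputStream payload, int sizeHint) {
    this.payload = payload;
    this.sizeHint = sizeHint;
  }

  InputStream getRawResponse() throws IOException {
    // Check the cache first, so the payload is drained at most once.
    if (inputContent == null) {
      // Pre-size the buffer instead of growing from the 32-byte default.
      ByteArrayOutputStream out =
          new ByteArrayOutputStream(sizeHint > 0 ? sizeHint : 8192);
      byte[] chunk = new byte[8192];
      int read;
      while ((read = payload.read(chunk)) != -1) {
        out.write(chunk, 0, read);
      }
      inputContent = out.toByteArray();
    }
    // Each call gets a fresh view over the same cached bytes.
    return new ByteArrayInputStream(inputContent);
  }
}
```

This still holds the whole body in memory, and {{toByteArray()}} makes one final copy. It only avoids the redundant copy on repeated calls and the repeated buffer growth while reading.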
 

> JVM crashes due to OutOfMemory encountered: Java heap space 
> ------------------------------------------------------------
>
>                 Key: OLINGO-1504
>                 URL: https://issues.apache.org/jira/browse/OLINGO-1504
>             Project: Olingo
>          Issue Type: Bug
>          Components: odata4-client
>    Affects Versions: (Java) V4 4.7.0
>            Reporter: Devansh Soni
>            Priority: Major
>         Attachments: 
> 0001-OLINGO-1504-override-getRawResponse-method-in-ODataE.patch, 
> HeapDumpLargestObjects.png, JprofilerHeapWalkerGraph.png
>
>
> Hi 
>  The issue occurs for non-paginated OData feeds. The feed I had tested had 
> 100,000 rows and 9 columns. The JVM crashes due to insufficient heap size and 
> I can find the stack trace from the hs_err_pidPID log file. 
> {code:java}
> ID    Value
> 26    Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> 27    j  java.util.Arrays.copyOf([BI)[B+1
> 28    j  java.io.ByteArrayOutputStream.grow(I)V+36
> 29    j  java.io.ByteArrayOutputStream.ensureCapacity(I)V+12
> 30    j  java.io.ByteArrayOutputStream.write([BII)V+38
> 31    j  org.apache.commons.io.IOUtils.copyLarge(Ljava/io/InputStream;Ljava/io/OutputStream;[B)J+19
> 32    j  org.apache.commons.io.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;I)J+5
> 33    j  org.apache.commons.io.IOUtils.copyLarge(Ljava/io/InputStream;Ljava/io/OutputStream;)J+5
> 34    j  org.apache.commons.io.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)I+2
> 35    j  org.apache.olingo.client.core.communication.response.AbstractODataResponse.getRawResponse()Ljava/io/InputStream;+136
> 36    j  org.apache.olingo.client.core.communication.request.retrieve.ODataEntitySetIteratorRequestImpl$ODataEntitySetIteratorResponseImpl.getBody()Lorg/apache/olingo/client/api/domain/ClientEntitySetIterator;+23
> 37    j  org.apache.olingo.client.core.communication.request.retrieve.ODataEntitySetIteratorRequestImpl$ODataEntitySetIteratorResponseImpl.getBody()Ljava/lang/Object;+1
> 38    j  com.tableausoftware.odata.ODataProtocolImpl.fetchV4(Ljava/net/URI;Z)Lcom/tableausoftware/odata/ODataProtocolImpl$ODataResults;+50
> 39    j  com.tableausoftware.odata.ODataResultSetV4.nextBlockImpl()Lcom/tableausoftware/data/generated/DataStream$Block;+23
> 40    j  com.tableausoftware.data.ProtobufResultSet.nextBlock()Lcom/tableausoftware/data/generated/DataStream$Block;+1
> 41    j  com.tableau.connect.service.QueryTask.readData()V+46
> 42    j  com.tableau.connect.service.QueryTask.call()Ljava/lang/Void;+9
> 43    j  com.tableau.connect.service.QueryTask.call()Ljava/lang/Object;+1
> 44    j  java.util.concurrent.FutureTask.run()V+42
> 45    j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> 46    j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> 47    j  java.lang.Thread.run()V+11
> 48    v  ~StubRoutines::call_stub
> {code}
>  
> The issue happens because there are multiple copies of streams being created 
> in the 
> AbstractODataResponse.[getRawResponse|https://github.com/apache/olingo-odata4/blob/master/lib/client-core/src/main/java/org/apache/olingo/client/core/communication/response/AbstractODataResponse.java#L300]
>  method, specifically in the org.apache.commons.io.IOUtils.copy method. To me 
> it seems like it fails to expand the internal byte buffer when it reaches 
> capacity. 
>  However, I do not understand why the response payload is being copied into a 
> ByteArrayOutputStream 
> {noformat}
> org.apache.commons.io.IOUtils.copy(payload, byteArrayOutputStream);
> {noformat}
> and then again converted into a ByteArrayInputStream. This copying of streams 
> causes creation of multiple byte buffers which fills up the heap memory. 
> The 
> ODataEntitySetIteratorResponseImpl.[getBody|https://github.com/apache/olingo-odata4/blob/master/lib/client-core/src/main/java/org/apache/olingo/client/core/communication/request/retrieve/ODataEntitySetIteratorRequestImpl.java#L78]
>  calls getRawResponse() in the constructor call of 
> {noformat}
> entitySetIterator = new ClientEntitySetIterator<>(
>                 odataClient, getRawResponse(), 
> ContentType.parse(getContentType()));
>       }
> {noformat}
>  
> However the 
> [constructor|https://github.com/apache/olingo-odata4/blob/master/lib/client-api/src/main/java/org/apache/olingo/client/api/domain/ClientEntitySetIterator.java#L79]
>  of ClientEntitySetIterator accepts InputStream. 
> So I do not understand the reason behind conversion of the payload into 
> ByteArrayInputStream in the AbstractODataResponse. I am trying to figure the 
> reason why this was done versus returning the payload InputStream as-is. 
> For fixing the problem, we have tried increasing the Java heap size but it is 
> just a temporary solution since once the OData feed size increases further 
> beyond a limit, it will fail again.
> I also got heap dump for the Java crash and was able to visualize the largest 
> object byte[] in a jprofiler to reach the same conclusion as above. 
>   



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
