> On Oct. 17, 2017, 11:25 p.m., Aihua Xu wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 261 (patched)
> > <https://reviews.apache.org/r/63043/diff/1/?file=1857700#file1857700line262>
> >
> >     OK. I started to recall why we did that this way. I think we don't want 
> > to call query.close() in the subclass since when you return query.execute() 
> > actually DN doesn't get the result yet until you consume those results. 
> >     
> >     That's why we want the caller to be responsible to call close() 
> > explictly and the caller knows that it has consumed all the data. 
> >     
> >     The new implmentation will close the query object inside the subclass 
> > right?
> 
> Alexander Kolbasov wrote:
>     Originally Query wasn't Closeable or AutoCLoseable, so you couldn't use 
> it anyway. Now it is and it could be possible to use Query directly, but this 
> raises a problem of handling exceptions on close.
>     
>     Answering the second part of your question. The query now is closed 
> immediately after the use (in subclass that creates the query). Query results 
> can be used after query close, but I need to copy them into another 
> collection - that's why there are a bunch of changes like `return new 
> ArrayList<>(result)`.
> 
> Aihua Xu wrote:
>     I see. I believe the original solution is to avoid the memory pressure on 
> HMS. With your approach, we are prefetching all the data and save in a 
> collection, while the original approach delays the data retrieval until it's 
> needed. 
>     
>     Is that possible to preserve that logic but still let it close 
> automatically?
> 
> Alexander Kolbasov wrote:
>     Note that we are not copying the data itself in the new ArrayList, we are 
> copying pointers. So the only extra overhead is the newly allocated array 
> which is mostly a short-living object and not a big deal. The prefetches are 
> present in the old code - I didn't add any new ones.
>     
>     Doing what you suggest requires bigger code refactoring since 
> try-with-resource is based on try-blocks, so we need to significantly shuffle 
> the code around.

Maybe I didn't state clearly. When you call query.execute(), what I have 
noticed that DN actually didn't really execute the query. It will only return 
an object and wait until the result is getting consumed and DN will do the 
actual query against the database.

So for the following code, I remember query.execute() doesn't return any data 
for you at that time yet. Now if new ArrayList<> retrieves the data and it 
triggers the actual execution, then you are fetching all the data at a time 
(cause memory pressure); if it doesn't and if q.close() is called, then the 
List you returned to the parent is not accessible since the query is closed.

you will have either issue. That's what I remembered from the past. Can you 
check test failure TestHiveMetaTool.testExecuteJDOQL if it's related to this? 
Right now we are using directSQL access, so we may hide these DN issues from 
the tests.

private List sub() {

  try (queryWrapper q) {
  
     return new ArrayList<>((List<MPartition>) query.execute(dbName, tableName, 
partNameMatcher));
  }
}


- Aihua


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63043/#review188419
-----------------------------------------------------------


On Oct. 16, 2017, 6:37 p.m., Alexander Kolbasov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63043/
> -----------------------------------------------------------
> 
> (Updated Oct. 16, 2017, 6:37 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Andrew Sherman, Alan Gates, Janaki 
> Lahorani, Sergio Pena, Sahil Takiar, and Vihang Karajgaonkar.
> 
> 
> Bugs: 17730
>     https://issues.apache.org/jira/browse/17730
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17730 Queries can be closed automatically
> 
> The goal of the patch is to replaces all curret uses of QueryWrapper with 
> try-with-resource construct that guarantees that the
> query is closed.
> 
> This change affects the scope of the open query, so in all cases where the 
> code was returning the list of query results, I had to copy the list to a new 
> ArrayList to allow access after query close.
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java 
> 22e246f1c914532ed6c67151c261fa709a283ef2 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  ffb2abdf6294dbd470525ad888bd95feac7e7f06 
> 
> 
> Diff: https://reviews.apache.org/r/63043/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexander Kolbasov
> 
>

Reply via email to