> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
> > Line 842 (original)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990385#file1990385line850>
> >
> >     Should this instead be (queryPlan.hasAcidResourcesInQuery() || 
> > (queryPlan.isMVRebuild())
> >      or similar ?

This is not part of this patch, it was coming from patch used to mv the id list 
generation. I have removed this.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
> > Lines 966 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990385#file1990385line972>
> >
> >     Unused function.

It is used in L1115.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
> > Lines 1083 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990385#file1990385line1089>
> >
> >     LOG.warn()
> >     Also add in message that rebuild will fail now.

This may also happen when we succeed, I have just transform it into debug.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java
> > Lines 221 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990387#file1990387line221>
> >
> >     Can you add a comment on why it never actually locks.

// This is default implementation. Locking only works for incremental 
maintenance
// which only works for DB transactional manager, thus we cannot acquire a lock.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1352-1353 (original), 1376-1377 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990388#file1990388line1384>
> >
> >     Comment doesn't match code. 
> >     Should this be outdated = false?

Comment was outdated indeed, I have removed it. There is a comment just above 
that explains the logic. _outdated = true_ when _diff == 0L or we are 
rebuilding the MV_, and _the MV has been invalidated_, because in those cases 
it cannot be outdated at all.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1460 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990388#file1990388line1471>
> >
> >     This will update all MV before they are considered for rewrite. Looks 
> > like we are doing extra work which could be avoided by delaying augmenting 
> > MV until it qualifies for rewrite.

Rewriting algorithm needs those filters to produce correct rewriting, hence 
though we are indeed doing extra work, I do not think we can avoid it. This 
part is also useful when MVs are outdated (not only for rebuild), as they may 
still be used to produce correct rewritings.

On the other hand, we could avoid this work by implementing the pre-filtering 
techniques that we were talking about at some point. The idea would be that 
only views relevant for a given query will be returned by the method, then they 
would be enriched.

Another improvement is that for a rebuild operation, we only consider in the 
rewriting the MV that we are rebuilding? However, this is a kind of heuristic, 
since other MVs that are up-to-date may be useful to rewrite the rebuild query.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1522 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990388#file1990388line1533>
> >
> >     Is this guaranteed to be an acid table. Else, this may throw NPE when 
> > row_id is not present.

This should be an ACID table. _augment_ should not be called otherwise. I have 
added additional check on the invalidation information so we do not call the 
method.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
> > Lines 57-60 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990390#file1990390line57>
> >
> >     Is this rewrite always better?
> >     
> >     Imagine tabA and tabB to be in millions rows but their join generates 
> > cross product of billions of rows. In prev case we were doing union of 
> > large MV with join of two tables (with which filter may get very small) as 
> > compared to here where we do ROJ of MV with one source table.

Comment may be misleading. In the last MERGE, _source_ refers to _source_ 
block, not to a single query:
_(SELECT a, b, SUM(x) AS s, COUNT(*) AS c --NEW DATA
 FROM TAB_A
 JOIN TAB_B ON (TAB_A.a = TAB_B.z)
 WHERE TAB_A.ROW_ID > 5
 GROUP BY a, b) source_

In any case, I think the point about performance is relevant. I expect if we do 
a rebuild with not many new rows, incremental maintenance will get the edge. 
However, if there are many changes in source tables, union may be benefitial 
over ROJ? But in the latter case, we also need to take into account that the 
UNION operation will invalidate the LLAP cache of the MV, while the ROJ variant 
will not. I can add a flag to control the incremental maintenance piece, even 
if it is always true, in case we want to disable it to do some tests in the 
future? What do you think?


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/MaterializedViewRewritingRelVisitor.java
> > Lines 75 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990392#file1990392line75>
> >
> >     should this check inner join ?

Currently rewritings with outer join are not produced, hence check is not 
needed unless I am missing something?


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/MaterializedViewRewritingRelVisitor.java
> > Lines 79 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990392#file1990392line79>
> >
> >     Doesnt it need to check agg functions which are supported : min, max, 
> > count, sum ?

If aggregation function is not supported, we cannot generate the rewriting. 
Supported functions are identified in _HiveRelBuilder.getRollup_.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 703-706 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990424#file1990424line703>
> >
> >     These objects are not always needed. If so, we can delay getting them 
> > until needed.

_materializationsRebuildLockHandler_ is needed by the _try_ and _finally_ 
clauses, hence I would not move it (we would need to call method twice).
_materializationsInvalidationCache_ is only needed in the _try_ clause, I have 
moved it.


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/thrift/hive_metastore.thrift
> > Lines 1164-1165 (original), 1164-1166 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990426#file1990426line1164>
> >
> >     Shouldn't they be required?

These two fields may not be set when Materialization is created. Then, when it 
is retrieved, they are populated with invalidation time and whether there was a 
source table update/delete when we retrieve them from the invalidation cache.


- Jesús


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66369/#review200335
-----------------------------------------------------------


On March 29, 2018, 8:56 p.m., Jesús Camacho Rodríguez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66369/
> -----------------------------------------------------------
> 
> (Updated March 29, 2018, 8:56 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18839
>     https://issues.apache.org/jira/browse/HIVE-18839
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-18839
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 697b1940a928000d23e414025d990ef50d20ea69 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> 01cadeaaab2c8dd97b081d59baa06f450338027f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MaterializedViewTask.java 
> de120afbbcd02dc23dff163c14fd7c8fba25db0c 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 
> 89ca1ff5cf1135cda85090ad2d03088b51680151 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 0db2a2c3ed5fda2352f2f63acbce8c6b4b889b84 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java 
> c8cafa2a68f897b1034e5bf61e872e044e01c22a 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> d76c8b6f40e8ab8a62875f6801d40b6c23c0ae85 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
>  53dc8ec1974dfc095bd5c7601c8d486b4712319e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveNoAggregateIncrementalRewritingRule.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/MaterializedViewRewritingRelVisitor.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 
> 41de17fd4679009ef6a4fb5a6d976cbc794ce791 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java
>  75eb50c5797b312f66e81b5cec23849684e641fc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 26f20f2e05478fdbd05e494bf1a843567f2dff2c 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
> 4c86fb89376041b30bc5c90a4e84f2e685d8ff82 
>   ql/src/test/queries/clientpositive/materialized_view_create_rewrite_4.q 
> c7f050be8da71f13463c22df2ccdab21c0e40b6a 
>   ql/src/test/queries/clientpositive/materialized_view_create_rewrite_5.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
> 65614566c9b738ea0354dc6d206a0ed7e635174e 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_4.q.out 
> 48c0ecb23f1657c99ef30ed6963a3af1737f0514 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_5.q.out 
> PRE-CREATION 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> e10a655e762eaaa5b0c32b2d1cf43a80a42410bc 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> 0f49d93277165e572c9b607eeed9ab3a168c1c0b 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  d7319e247524d82603bc27d1bb791cbe1c79d740 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> 724f02229b54f0e9a7ec8ab0546a83650860d1f1 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 8e357f6422fecfecac5a558c04ec741f06b87a62 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Materialization.java
>  fec35d50b7009439da7f7b40911e4816996fb4fc 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
>  e824f4a145cd049eb712b372dbf7fea3b7c36d42 
>   
> standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php 
> d00d11be3e054b805cb5c09c0222da2aa3a013f7 
>   standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 
> 32c8cb7bc57eb09a67bb971bcfe49155f6790a78 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
>  1b3ebcf6ef9775785008bd3b549516f8932dd64f 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
>  cf36654b5127a445577197dbf365352081e64e96 
>   standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
> fd157b2781f0746234ab62548855f14b5fc4ed64 
>   standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
> aa93158fa621e48552c0d108b68a12d8ed3589e9 
>   standalone-metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 
> 7a07b73cc709684e8c548ebbbe4882d562f069e5 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  8a5de09ccf1ebc21a86f479a65bb6e78defc3c31 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  d4bdcf1a79b18e61eb5d427646019a6dbcc9b4c0 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  c69192b73195dcfae3f2567d419ef0b0cf913a1e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationInvalidationInfo.java
>  3d774071c2ff00e892540364a7ec63e49b8089df 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsInvalidationCache.java
>  1636d48d2c7dd21ea25dd3d670103ee3c618df38 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsRebuildLockCleanerTask.java
>  PRE-CREATION 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsRebuildLockHandler.java
>  PRE-CREATION 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  b8976ed953d1c449ce7d78b3cd37b39218e42bc2 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  e453e5a7d2ec75a962194ae95a866aafee6e1b47 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  38fa0e2daa17fddda139087d20bc43573af0ea9c 
>   standalone-metastore/src/main/thrift/hive_metastore.thrift 
> 68d7b45d0dae8bc8509a672c39dd57b029433e3c 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMaterializationsCacheCleaner.java
>  6d3f68c0f0723dcdf15ee80de91a6c5a18c202ef 
> 
> 
> Diff: https://reviews.apache.org/r/66369/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>

Reply via email to