Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13300 )

Change subject: IMPALA-8447: [DOCS] INSERT event is supported in automatic 
invalidation
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13300/1/docs/topics/impala_metadata.xml
File docs/topics/impala_metadata.xml:

http://gerrit.cloudera.org:8080/#/c/13300/1/docs/topics/impala_metadata.xml@161
PS1, Line 161: Refreshes
> Do you mean the table has to have some data before receiving the INSERT eve
By "table being loaded", I meant that the metadata of the table is already
loaded by the Catalog service. Consider the following examples:

Assume a table foo exists in the Catalog service when it comes up for the first
time. Since no query has been issued on foo, its metadata has not been loaded
in the Catalog so far.
1. A user issues a select query on table foo. The table's metadata gets loaded
in the Catalog service.
2. After some time, another ETL process in Hive inserts data into table foo.
This produces an INSERT event. The ETL process in Hive is just an example;
it could also be another Impala cluster inserting data.
3. When the INSERT event is processed, the table is already loaded in the
Catalog service, so the Event processor refreshes the table (meaning it
refetches the metadata) so that the new rows are seen when queried from Impala.

Now consider this example:
There is no step 1 above, so the table is not loaded in the Catalog.
When step 2 occurs, the INSERT event is received by the Event processor, but
since the table is not loaded, it does NOT refresh the table. However, as soon
as a query is issued on table foo, its metadata gets fetched and the query sees
the new rows. In theory, the Event processor was not needed in this example,
since the metadata was fetched AFTER the last insert issued on the table.

These are all implementation details for your understanding. Sorry for not 
clarifying it better earlier. As far as documenting this is concerned, we can 
skip the details and just say,

"Refreshes a loaded table when it receives an INSERT event. If the table is not
loaded at the time of processing the INSERT event, the Event processor does not
need to refresh the table and skips it."



--
To view, visit http://gerrit.cloudera.org:8080/13300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68133b0beeb15cacc73829b8a8b0838fc7f4b7d8
Gerrit-Change-Number: 13300
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Bharath Krishna <bhar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 May 2019 16:21:58 +0000
Gerrit-HasComments: Yes
