[ https://issues.apache.org/jira/browse/IMPALA-9532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063658#comment-17063658 ]
Vihang Karajgaonkar commented on IMPALA-9532:
---------------------------------------------

IMPALA-9483 might be a different issue since it is related to {{BuiltinsDb}}, which is not affected by the global invalidate.

> Functions can disappear when a concurrent invalidate metadata is running
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-9532
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9532
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Vihang Karajgaonkar
>            Priority: Major
>              Labels: concurrency
>
> The global invalidate metadata takes a write lock on the {{versionLock_}}.
> However, the locking protocol for DDLs releases the {{versionLock_}} as soon
> as the table level lock is acquired. This allows a concurrent
> {{invalidate metadata}} to run while the DDL operation is in progress, which
> can lead to race conditions. In the example below, a function disappears
> from the catalog until an invalidate metadata is issued again.
> The following sequence of events reproduces this race condition:
> {noformat}
> [localhost:21000] default> create function default.f() returns int location
> '/test-warehouse/libTestUdfs.so' symbol='NoArgs';
> Query: create function default.f() returns int location
> '/test-warehouse/libTestUdfs.so' symbol='NoArgs'
> +----------------------------+
> | summary                    |
> +----------------------------+
> | Function has been created. |
> +----------------------------+
> Fetched 1 row(s) in 10.26s
> --> Session 2 invokes invalidate metadata concurrently
> [localhost:21001] default> invalidate metadata;
> Query: invalidate metadata
> Query submitted at: 2020-03-18 15:04:25 (Coordinator:
> http://vihang-Precision-21575:25001)
> Query progress can be monitored at:
> http://<redacted>/query_plan?query_id=d3463484ff635684:620fbfef00000000
> Fetched 0 row(s) in 4.30s
> --> drop function from session1 says function does not exist but show
> functions shows it.
> [localhost:21000] default> drop function f();
> Query: drop function f()
> ERROR: CatalogException: Function: f() does not exist.
> [localhost:21000] default> show functions;
> Query: show functions
> +-------------+-----------+-------------+---------------+
> | return type | signature | binary type | is persistent |
> +-------------+-----------+-------------+---------------+
> | INT         | f()       | NATIVE      | true          |
> +-------------+-----------+-------------+---------------+
> Fetched 1 row(s) in 0.01s
> [localhost:21000] default>
> -- Session 2 never sees the function f:
> [localhost:21001] default> show functions;
> Query: show functions
> Fetched 0 row(s) in 0.00s
> {noformat}
> When the create function statement is executing in {{CatalogOpExecutor}} we
> apply the alterDatabase in HMS to persist the new db parameters here:
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1409]
> Note that we have already released the {{versionLock_}} by line 1409.
> Meanwhile, a concurrent {{invalidate metadata}} fetches the db params from
> HMS here
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1326]
> which will override the parameters of the newly created Db object. Hence
> we are effectively removing the function from the parameters, since the
> alterDatabase call of the first operation is not yet committed in HMS.
> All subsequent {{show functions}} and {{drop function}} commands will show
> inconsistent results. I was able to reproduce this race condition by adding a
> sleep statement just before the alterDatabase call in the createFunction
> method.
> Note: The code links above are based on commit hash
> {{7dd13f72784514a59f82c9a7a5e2250503dbfaf0}}
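
For readers who want to see the interleaving in isolation, here is a minimal, self-contained sketch of the race described in the issue. It is not Impala code: the class {{InvalidateRaceSketch}}, the two maps standing in for the HMS db parameters and the catalog's cached Db parameters, and the {{dbLock}} are all hypothetical. It also assumes the DDL updates the in-memory Db before calling alterDatabase, and it uses latches in place of the sleep statement so the interleaving is deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the race; none of these names are Impala APIs.
public class InvalidateRaceSketch {
    public static final String FN = "default.f()";

    // Stand-in for the db parameters persisted in HMS.
    public final Map<String, String> hmsDbParams = new HashMap<>();
    // Stand-in for the parameters of the catalog's cached Db object.
    public Map<String, String> catalogDbParams = new HashMap<>();

    private final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock();
    private final ReentrantLock dbLock = new ReentrantLock();

    // Global invalidate: under the version write lock, rebuild the cache from HMS.
    public void invalidateMetadata() {
        versionLock.writeLock().lock();
        try {
            catalogDbParams = new HashMap<>(hmsDbParams);
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    // Runs "create function" and "invalidate metadata" in the problematic order.
    public void runRace() {
        CountDownLatch versionLockReleased = new CountDownLatch(1);
        CountDownLatch invalidateDone = new CountDownLatch(1);

        Thread ddl = new Thread(() -> {
            versionLock.writeLock().lock();
            dbLock.lock();
            // The DDL protocol drops the version lock once the object lock is held.
            versionLock.writeLock().unlock();
            try {
                // Assumed order: in-memory update first, HMS commit afterwards.
                catalogDbParams.put(FN, "NATIVE");
                versionLockReleased.countDown();
                // Plays the role of the sleep placed before alterDatabase.
                invalidateDone.await();
                hmsDbParams.put(FN, "NATIVE"); // alterDatabase commits too late
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                dbLock.unlock();
            }
        });

        Thread invalidate = new Thread(() -> {
            try {
                versionLockReleased.await();
                invalidateMetadata(); // reads HMS before the DDL has committed
                invalidateDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        ddl.start();
        invalidate.start();
        try {
            ddl.join();
            invalidate.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
    }

    public static void main(String[] args) {
        InvalidateRaceSketch s = new InvalidateRaceSketch();
        s.runRace();
        System.out.println("catalog has f after race: " + s.catalogDbParams.containsKey(FN));
        System.out.println("HMS has f after race: " + s.hmsDbParams.containsKey(FN));
        s.invalidateMetadata();
        System.out.println("catalog has f after second invalidate: "
                + s.catalogDbParams.containsKey(FN));
    }
}
```

Running {{main}} prints false, true, true: the function is missing from the cached parameters after the race even though it reaches HMS, and a second invalidate brings it back, which matches the behaviour reported above. The sketch only models the catalog side; it does not model the coordinator-local view that lets session 1 still list the function.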