Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/1014
@ilooner, there are two phases where issues can arise. The problem at hand is
in the planner, but we've been talking about serialization out to the workers.
Here, we can learn from the work Arina did with dynamic UDFs, especially its
synchronization aspects, since that work faced challenges similar to what we
have here.
Let's walk through the lifecycle.
* User A defines a storage plugin config, call it P.
* User B runs a query Q that references P.
* At the same time, user A changes P to P'.
* Query Q is distributed to nodes.
* At the same time, user A changes P again to produce P''.
This is a classic distributed system synchronization problem. As we
discussed, once the query is planned, the contents of the plugin definition are
serialized as part of the physical plan for each fragment. As a result we
"freeze" the contents of P at the time of serialization.
The problem that Arina seems to be describing occurs during planning. Rather
than taking a copy of P at some fixed point in time and then always using that
copy, we seem to be holding a named reference to P. This introduces the obvious
synchronization issues.
So, what we should do is, once a query resolves P, the query takes a copy
and uses that copy for the rest of query planning and execution.
This resolves race conditions by saying that a query (on all nodes) uses
the version of the plugin definition that existed at the moment that the
reference to the plugin definition was first resolved by that query. The query
will be oblivious to all subsequent updates.
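To make the "copy on first resolve" idea concrete, here is a minimal sketch in Java. All of the names (`PluginSnapshotSketch`, `PluginConfig`, `PluginRegistry`, `QueryContext`) are hypothetical and are not Drill or Calcite classes. The sketch also assumes the config object is immutable, so "taking a copy" reduces to pinning a reference in per-query state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; none of these are Drill or Calcite classes.
final class PluginSnapshotSketch {

    /** Immutable plugin config, so a pinned reference behaves like a copy. */
    static final class PluginConfig {
        final String name;
        final String connection;

        PluginConfig(String name, String connection) {
            this.name = name;
            this.connection = connection;
        }
    }

    /** Shared registry that any user may update at any time. */
    static final class PluginRegistry {
        private final Map<String, PluginConfig> configs = new ConcurrentHashMap<>();

        void put(PluginConfig config) {
            configs.put(config.name, config);
        }

        PluginConfig get(String name) {
            return configs.get(name);
        }
    }

    /** Per-query state: pins each plugin config the first time it is resolved. */
    static final class QueryContext {
        private final PluginRegistry registry;
        private final Map<String, PluginConfig> pinned = new ConcurrentHashMap<>();

        QueryContext(PluginRegistry registry) {
            this.registry = registry;
        }

        /**
         * The first call for a name reads the shared registry and remembers
         * the result; later calls return the remembered config, even if the
         * registry has changed since. Unknown names return null and are not
         * pinned.
         */
        PluginConfig resolve(String name) {
            return pinned.computeIfAbsent(name, registry::get);
        }
    }
}
```

With this shape, query Q keeps seeing P after user A stores P' and P'' in the registry, while a query that starts later gets its own `QueryContext` and picks up the newer definition. Where such per-query state would live in the actual planner is the open question.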
Since plugin definition is done as part of workspace and table definition
in Calcite, this is likely to be a tricky change; it is not clear that Calcite
offers a per-query context to hold such information. Arina probably can offer
advice on this front.
---