I looked at this issue a while ago and concluded that pure immutability was a 
bad idea.

If the tree is 100% immutable then to change one leaf you have to copy the 
entire tree. But if we allow inputs to be changed — basically, changing the 
arcs of the graph but not the nodes — then we can swap out pieces of the tree, 
but we still have the guarantee that nodes won’t change under the Volcano 
planner’s feet. The Volcano planner really needs nodes to be immutable, 
especially after they have been registered. But of course it allows the inputs 
of a node (or the possible inputs) to change when you add nodes to a RelSubset.

Other parts of the system don’t need immutability. For instance, HepPlanner 
relies on replaceInput.

For mapping queries onto materialized views, I needed to be able to change 
individual attributes of relational expressions, and have relational 
expressions know about their parents (outputs), so I created 
SubsitutionVisitor.MutableRel and its sub-classes. 

Mutability gone wild makes the overall planning process difficult to maintain. 
Mutability is manageable as long as mutable expressions are only used within a 
single phase of the planning process and are copied to immutable expressions 
for the next phase.

Code in planner rules tends to call copy (or use factories, which are similar). 
It is not necessary to copy inputs. Only “system” code, that understands what 
it is doing, should call RelNode.replaceInput. (The Hive code should not be 
calling replaceInput.) So, I think we’re basically good. Maybe we should make 
it clear in replaceInput’s javadoc that it is considered unsafe.

This is very much like the debate about pure functional programming languages 
(see http://queue.acm.org/detail.cfm?id=2611829) but I disagree with purists 
like Erik Meijer. As you can tell, I’m in the “everything in moderation” camp.

Julian




On Dec 13, 2014, at 8:35 AM, Vladimir Sitnikov <[email protected]> 
wrote:

> Hi,
> 
> It turns out RelNode is mutable: replaceInput, well, replaces input in place.
> This replaceInput has just 5 usages (and 1 in tests) in Calcite source.
> Drill does not use replaceInput.
> Hive uses it a bit:
> https://github.com/apache/hive/search?utf8=%E2%9C%93&q=replaceInput&type=Code
> 
> This mutability makes RelNode.copy() semantics ill-defined. It is not
> clear if the method is safe to avoid copy or not.
> 
> I guess we need either go with immutable RelNode or always copy in
> .copy methods.
> 
> Any more thoughts on the subject?
> 
> -- 
> Regards,
> Vladimir Sitnikov

Reply via email to