Hello drillers,
I'm still puzzling over the purpose of the "selection" attribute of the "Scan"
operator and the "ref" attribute of various operators such as "Scan",
"Transform", "Group".
I notice that "selection" is not used (which is good, since there is no
"activity" attribute in donuts.json).
I understand that "ref" chooses the output expression(s) of each operator, and
see those expressions are necessary. But I don't understand why every "ref" in
simple_plan.json is prefixed with "donuts".
My understanding is that each operator's input and output are JSON arrays. The
elements of that array (the "rows" in SQL parlance) are usually JSON objects
(i.e. records with named fields) but might sometimes be scalars or arrays.
The output of the "aggregate" operator in simple_plan.json would be something
like
[
  {
    "donuts": {
      "sales" : 1099.22,
      "typeCount" : 1,
      "quantity" : 10000,
      "ppu" : 0.11
    }
  },
  {
    "donuts": {
      "sales" : 109.71,
      "typeCount" : 2,
      "quantity" : 159,
      "ppu" : 0.69
    }
  },
  {
    "donuts": {
      "sales" : 184.25,
      "typeCount" : 2,
      "quantity" : 335,
      "ppu" : 0.55
    }
  }
]
The output is a list of objects, each of which has just one field "donuts",
whose value is an object. The only purpose of the "donuts" prefix is to
increase the nesting level. And other operators do the same thing. It would
seem to me more natural to just use one level of nesting:
[
  {
    "sales" : 1099.22,
    "typeCount" : 1,
    "quantity" : 10000,
    "ppu" : 0.11
  },
  ...
]
Of course it's not wrong to do this, but I wanted to ask why someone would
choose an extra level of nesting. Or to check whether my understanding was
wrong. (I'm pondering how to make a SQL front-end generate something like
simple_plan.json and right now I can see no reason why it would generate ref
values with a "donuts." prefix.)
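
To make the comparison concrete, here is a small Python sketch showing that the
two row shapes above differ only by one wrapper field. It is purely
illustrative and not Drill code: the helper name "unwrap" is hypothetical, and
the rows are the sample values from the nested output above.

```python
# Hypothetical helper, not part of Drill: strips a single wrapper field
# (e.g. "donuts") from every row of an operator's output.
def unwrap(rows, prefix):
    flat = []
    for row in rows:
        # Unwrap only rows that are objects with exactly the one wrapper field;
        # scalars, arrays and other objects pass through unchanged.
        if isinstance(row, dict) and list(row.keys()) == [prefix]:
            flat.append(row[prefix])
        else:
            flat.append(row)
    return flat


# The nested rows, as in the "aggregate" output sketched above.
nested = [
    {"donuts": {"sales": 1099.22, "typeCount": 1, "quantity": 10000, "ppu": 0.11}},
    {"donuts": {"sales": 109.71, "typeCount": 2, "quantity": 159, "ppu": 0.69}},
    {"donuts": {"sales": 184.25, "typeCount": 2, "quantity": 335, "ppu": 0.55}},
]

flat = unwrap(nested, "donuts")
print(flat[0])
```

If the wrapper carries no information beyond the source name, a front-end
could emit either shape; which one the plan format actually intends is the
open question.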
Is the intent of "selection" to remove a level of nesting when reading a source?
Julian