There is an annotation on the function template. I don't have a laptop handy, but I believe it is something similar to isRandom. It tells Drill that this is a nondeterministic function. I will be more specific once I get back to my machine, if you don't find it sooner.
Jacques

*Summary:* Drill is very aggressive about optimizing away calls to functions with constant arguments. I worry that could extend to per record batch optimization if I accidentally have constant values and even if that doesn't happen, it is a pain in the ass now largely because Drill is clever enough to see through my attempt to hide the constant nature of my parameters.

*Question:* Is there a way to mark a UDF as not being a pure function?

*Details:* I have written a UDF to generate a random number. It takes parameters that define the distribution. All seems well and good.

I find, however, that the function is only called once (twice, actually, apparently due to pipeline warmup) and then Drill optimizes away later calls, apparently because the parameters to the function are constant and Drill thinks my function is a pure function. If I make up some bogus data to pass in as a parameter, all is well and the function is called as much as I wanted.

For instance, with the uniform distribution, my function takes two arguments, those being the minimum and maximum value to return.
Here is what I see with constants for the min and max:

0: jdbc:drill:zk=local> select random(0,10) from (values 5,5,5,5) as tbl(x);
into eval
into eval
+---------------------+
|       EXPR$0        |
+---------------------+
| 1.7787372583008298  |
| 1.7787372583008298  |
| 1.7787372583008298  |
| 1.7787372583008298  |
+---------------------+

If I include an actual value, we see more interesting behavior even if the value is effectively constant:

0: jdbc:drill:zk=local> select random(0,x) from (values 5,5,5,5) as tbl(x);
into eval
into eval
into eval
into eval
+----------------------+
|        EXPR$0        |
+----------------------+
| 3.688377805419459    |
| 0.2827056410711032   |
| 2.3107479622644918   |
| 0.10813788169218574  |
+----------------------+
4 rows selected (0.088 seconds)

Even if I make the max value come along from the sub-query, I get the evil behavior although the function is now surprisingly actually called three times, apparently to do with warming up the pipeline:

0: jdbc:drill:zk=local> select random(0,max_value) from (select 14 as max_value,x from (values 5,5,5,5) as tbl(x)) foo;
into eval
into eval
into eval
+---------------------+
|       EXPR$0        |
+---------------------+
| 13.404462063773702  |
| 13.404462063773702  |
| 13.404462063773702  |
| 13.404462063773702  |
+---------------------+
4 rows selected (0.121 seconds)

The UDF itself is boring and can be found at https://gist.github.com/tdunning/0c2cc2089e6cd8c030c0

So how can I defeat this behavior?
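
For anyone following along, here is a toy model in plain Java of the behavior described in this thread. It is only a sketch and not Drill's planner code: the names ConstantFoldingSketch, Udf, deterministic, and project are all made up for illustration. It assumes the optimizer folds a call with constant arguments into a single evaluation whenever the function is treated as pure, and evaluates it once per row when the function is flagged as nondeterministic. The real engine apparently makes an extra call or two for warmup, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleBinaryOperator;

public class ConstantFoldingSketch {

    /** Hypothetical stand-in for a UDF plus a flag saying whether it is pure. */
    static final class Udf {
        final DoubleBinaryOperator body;
        final boolean deterministic;
        int calls = 0; // counts how often the body actually ran

        Udf(DoubleBinaryOperator body, boolean deterministic) {
            this.body = body;
            this.deterministic = deterministic;
        }

        double eval(double a, double b) {
            calls++;
            return body.applyAsDouble(a, b);
        }
    }

    /**
     * Projects udf(min, max) over `rows` rows where both arguments are constants.
     * A function treated as pure is evaluated once and the result is reused;
     * a function flagged nondeterministic is evaluated for every row.
     */
    static List<Double> project(Udf udf, double min, double max, int rows) {
        List<Double> out = new ArrayList<>();
        if (udf.deterministic) {
            double folded = udf.eval(min, max); // constant folding: one call only
            for (int i = 0; i < rows; i++) {
                out.add(folded);
            }
        } else {
            for (int i = 0; i < rows; i++) {
                out.add(udf.eval(min, max)); // no folding: one call per row
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        DoubleBinaryOperator uniform = (lo, hi) -> lo + (hi - lo) * rng.nextDouble();

        Udf treatedAsPure = new Udf(uniform, true);
        Udf flaggedRandom = new Udf(uniform, false);

        // Same value repeated four times, like random(0,10) above.
        System.out.println(project(treatedAsPure, 0, 10, 4));
        // A fresh draw per row, like random(0,x) above.
        System.out.println(project(flaggedRandom, 0, 10, 4));
    }
}
```

In this model the only thing that changes the output is the flag on the function, which is the effect the annotation mentioned in the reply is meant to have.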