[ 
https://issues.apache.org/jira/browse/PHOENIX-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14244459#comment-14244459
 ] 

Julian Hyde commented on PHOENIX-1516:
--------------------------------------

[~jamestaylor] Oracle uses the DBMS_RANDOM package, which doesn't offer any 
consistency guarantees. I think you're right that we should generate a new 
value per row.

"Per row" is probably tricky to define precisely (consider: {{select random() 
from t where random() < 0.1 group by x}}), but I suspect that what you do for 
sequences is the right thing. In fact you could implement random the same way 
as sequences.

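Each distinct argument list passed to random would define its own "stream" of 
random numbers, and occurrences with the same arguments would share a stream. 
E.g. {{select random(), random(), random(1), random(1), random(2) from t where 
random(2) < 0.1}} would have 3 streams: one for {{random()}}, one for 
{{random(1)}}, and one for {{random(2)}}.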
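To make the stream idea concrete, here is a rough sketch in Python (not Phoenix code). The class {{RandomStreams}} and its methods are hypothetical names, and two things are assumptions on my part: that the argument acts as a seed, and that, as with sequences, every occurrence sharing a stream sees the same value within a row.

```python
import random


class RandomStreams:
    """Hypothetical model: one generator per distinct argument list."""

    def __init__(self):
        self._generators = {}  # argument tuple -> random.Random
        self._current = {}     # argument tuple -> value for the current row

    def register(self, *args):
        # Occurrences with identical arguments share a single stream.
        if args not in self._generators:
            seed = args[0] if args else None  # assumption: the argument is a seed
            self._generators[args] = random.Random(seed)

    def stream_count(self):
        return len(self._generators)

    def next_row(self):
        # Advance every stream exactly once per row.
        for key, generator in self._generators.items():
            self._current[key] = generator.random()

    def value(self, *args):
        # All occurrences with these arguments read the same per-row value.
        return self._current[args]


# The six occurrences in the example query above.
streams = RandomStreams()
for args in [(), (), (1,), (1,), (2,), (2,)]:
    streams.register(*args)

print(streams.stream_count())  # 3
```

Whether two occurrences of {{random()}} in the same row should really share a value is exactly the kind of thing the "per row" contract would have to pin down; the sketch just picks the sequence-like answer.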

We should not guarantee that executions of the same statement with different 
plans (e.g. because the table has been re-analyzed) give the same results, even 
if the random numbers have the same seeds. This will give the optimizer the 
freedom to push down or even eliminate random functions (while maintaining the 
"per row" contract).

At some point we might want random generators that generate e.g. integers, 
strings, and doubles between 0 and 1. These could be added on top of the 
"random streams" infrastructure quite easily.
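As a rough illustration of layering typed generators on a stream, the Python sketch below treats a seeded {{random.Random}} instance as the stream. The helper names ({{random_double}}, {{random_int}}, {{random_string}}) and the lowercase-letter alphabet are assumptions made for the example, not proposed function names.

```python
import random
import string


def random_double(stream):
    # Double in the half-open interval [0, 1).
    return stream.random()


def random_int(stream, low, high):
    # Integer in the closed interval [low, high].
    return stream.randint(low, high)


def random_string(stream, length):
    # String of lowercase ASCII letters (the alphabet is an assumption).
    return "".join(stream.choice(string.ascii_lowercase) for _ in range(length))


# All three helpers draw from the same underlying stream.
stream = random.Random(1)
row = (random_double(stream), random_int(stream, 1, 100), random_string(stream, 8))
print(row)
```

Because each helper only consumes values from the stream it is given, the seeding and per-row rules would stay in one place.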

> Add RANDOM built-in function
> ----------------------------
>
>                 Key: PHOENIX-1516
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1516
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 1516.txt
>
>
> I often find it useful to generate some rows with random data.
> Here's a simple RANDOM() function that we could use for that.


