[ 
https://issues.apache.org/jira/browse/SPARK-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992233#comment-14992233
 ] 

Michael Armbrust commented on SPARK-11470:
------------------------------------------

The more I think about this, the more I think that it's just a 
documentation issue.  I would say that `deterministic`, as it applies to Spark, 
should mean: "Given some set of inputs, this function will always return the 
same result".  EvaluateOnce or names of that nature are confusing.  Such a name 
really means evaluate only once within any given task attempt, but possibly 
evaluate again if there is a failure.  So changing the name isn't going to help 
anyone, as they still need to correctly ensure determinism in the face of retries.
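
To make the retry point concrete, here is a rough sketch in plain Python. It is 
not Spark code: `run_task_attempt`, `deterministic_fn` and `hidden_state_fn` 
are hypothetical names used only for illustration, and a task attempt is 
modelled as a simple loop over rows. Each function is evaluated exactly once 
per row within an attempt, yet only the deterministic one gives the same 
answer when the attempt is repeated.

```python
import itertools

# Hidden state shared across calls (an assumption made for this sketch).
_counter = itertools.count()

def deterministic_fn(x):
    # Same input always yields the same output.
    return x * 2

def hidden_state_fn(x):
    # Output depends on a call counter, not only on x.
    return x + next(_counter)

def run_task_attempt(fn, rows):
    # Hypothetical stand-in for one task attempt: one evaluation per row.
    return [fn(row) for row in rows]

rows = [1, 2, 3]

# Simulate a failure followed by a retry of the same task.
first = run_task_attempt(deterministic_fn, rows)
retry = run_task_attempt(deterministic_fn, rows)

first_hidden = run_task_attempt(hidden_state_fn, rows)
retry_hidden = run_task_attempt(hidden_state_fn, rows)

print(first == retry)                # True: the retry reproduces the result
print(first_hidden == retry_hidden)  # False: the retry sees different state
```

In this toy model, "evaluate once" holds within each attempt for both 
functions, so the name alone says nothing about whether a retry is safe.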

Regarding "reuse result", that should be the default.  We are close to adding 
subexpression elimination, so I don't see any reason to expand the API further 
with a flag that should always be true.
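
As a sketch of what reusing a result means, the plain Python below evaluates a 
repeated subexpression once and shares the value. It only illustrates the 
general idea and is not how Spark implements it; `expensive`, `without_reuse` 
and `with_reuse` are hypothetical names. The rewrite is only valid because 
`expensive` returns the same result for the same input.

```python
# Call counter used only to observe how often the function runs.
call_count = {"n": 0}

def expensive(x):
    # Deterministic: the result depends only on x.
    call_count["n"] += 1
    return x * x

def without_reuse(x):
    # The shared subexpression expensive(x) is evaluated twice.
    return expensive(x) + expensive(x)

def with_reuse(x):
    # The shared subexpression is evaluated once and its result reused.
    shared = expensive(x)
    return shared + shared

call_count["n"] = 0
result_without = without_reuse(3)
calls_without = call_count["n"]

call_count["n"] = 0
result_with = with_reuse(3)
calls_with = call_count["n"]

print(result_without, result_with)  # 18 18
print(calls_without, calls_with)    # 2 1
```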

> Figure out a good name for the public API
> -----------------------------------------
>
>                 Key: SPARK-11470
>                 URL: https://issues.apache.org/jira/browse/SPARK-11470
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>            Priority: Blocker
>
> Right now, the public API is called {{nondeterministic}}. As pointed out 
> [here | 
> https://issues.apache.org/jira/browse/SPARK-11438?focusedCommentId=14986377&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14986377],
>  this name is confusing. We should look into a better name for this API. 



