[ 
https://issues.apache.org/jira/browse/SPARK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287538#comment-16287538
 ] 

Hyukjin Kwon commented on SPARK-21914:
--------------------------------------

I totally misunderstood your comment [~rxin]. I was so focused on the elapsed 
time in the tests at that time but you the actual test coverage ..

Would it be okay if the diff is small? Let me try this in my local and maybe 
open a PR if the diff is small to show it. Otherwise, will leave this closed if 
it's complex and big.

> Running examples as tests in SQL builtin function documentation
> ---------------------------------------------------------------
>
>                 Key: SPARK-21914
>                 URL: https://issues.apache.org/jira/browse/SPARK-21914
>             Project: Spark
>          Issue Type: Test
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Hyukjin Kwon
>
> It looks we have added many examples in {{ExpressionDescription}} for builtin 
> functions.
> Actually, if I have seen correctly, we have fixed many examples so far in 
> some minor PRs and sometimes require to add the examples as tests sql and 
> golden files.
> As we have formatted examples in {{ExpressionDescription.examples}} - 
> https://github.com/apache/spark/blob/ba327ee54c32b11107793604895bd38559804858/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java#L44-L50,
>  and we have `SQLQueryTestSuite`, I think we could run the examples as tests 
> like Python's doctests.
> Rough way I am thinking:
> 1. Loads the example in {{ExpressionDescription}}.
> 2. identify queries by {{>}}.
> 3. identify the rest of them as the results.
> 4. run the examples by reusing {{SQLQueryTestSuite}} if possible.
> 5. compare the output by reusing {{SQLQueryTestSuite}} if possible.
> Advantages of doing this I could think for now:
> - Reduce the number of PRs to fix the examples
> - De-duplicate the test cases that should be added into sql and golden files.
> - Correct documentation with correct examples.
> - Reduce reviewing costs for documentation fix PRs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to