Actually there are not so many Java test cases in Spark (because Scala runs
on JVM as everybody knows)[1].

Given that, I think we can avoid to put some efforts on this for now .. I
don't mind if somebody wants to give a shot since it looks good anyway but
to me I wouldn't spend so much time on this ..

Let me just go ahead as I suggested if you don't mind. Anyone can give a
shot for Display Name - I'm willing to actively review and help.

[1]
git ls-files '*Suite.java' | wc -l
     172
git ls-files '*Suite.scala' | wc -l
    1161

2019년 11월 18일 (월) 오전 3:27, Steve Loughran <ste...@cloudera.com>님이 작성:

> Test reporters do often contain some assumptions about the characters in
> the test methods. Historically JUnit XML reporters have never sanitised the
> method names so XML injection attacks have been fairly trivial. Haven't
> tried this for a while.
>
> That whole JUnit XML report "standard" was actually put together in the
> Ant project with <Junitreport> doing the postprocessing of the JUnit run.
> It was driven by the team's XSL skills than any overreaching strategic goal
> about how to present test results of tests which could run for hours and
> whose output you would really want to aggregate the locks from multiple
> machines and processes and present in awake you can actually navigate. With
> hindsight, a key failing is that we chose to store the test summaries (test
> count, failure count...) as attributes on the root XML mode. Which is why
> the whole DOM gets built up in the JUnit runner. Which is why when that
> JUnit process crashes, you get no report at all.
>
> It'd be straightforward to fix -except too much relies on that file
> now...important things will break. And the maven runner has historically
> never supported custom reporters, to let you experiment with it.
>
> Maybe this is an opportunity to change things.
>
> On Sun, Nov 17, 2019 at 1:42 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
>> DisplayName looks good in general but actually here I would like first to
>> find a existing pattern to document in guidelines given the actual existing
>> practice we all are used to. I'm trying to be very conservative since this
>> guidelines affect everybody.
>>
>> I think it might be better to discuss separately if we want to change
>> what we have been used to.
>>
>> Also, using arbitrary names might not be actually free due to such bug
>> like https://github.com/apache/spark/pull/25630 . It will need some more
>> efforts to investigate as well.
>>
>> On Fri, 15 Nov 2019, 20:56 Steve Loughran, <ste...@cloudera.com.invalid>
>> wrote:
>>
>>>  Junit5: Display names.
>>>
>>> Goes all the way to the XML.
>>>
>>>
>>> https://junit.org/junit5/docs/current/user-guide/#writing-tests-display-names
>>>
>>> On Thu, Nov 14, 2019 at 6:13 PM Shixiong(Ryan) Zhu <
>>> shixi...@databricks.com> wrote:
>>>
>>>> Should we also add a guideline for non Scala tests? Other languages
>>>> (Java, Python, R) don't support using string as a test name.
>>>>
>>>> Best Regards,
>>>> Ryan
>>>>
>>>>
>>>> On Thu, Nov 14, 2019 at 4:04 AM Hyukjin Kwon <gurwls...@gmail.com>
>>>> wrote:
>>>>
>>>>> I opened a PR - https://github.com/apache/spark-website/pull/231
>>>>>
>>>>> 2019년 11월 13일 (수) 오전 10:43, Hyukjin Kwon <gurwls...@gmail.com>님이 작성:
>>>>>
>>>>>> > In general a test should be self descriptive and I don't think we
>>>>>> should be adding JIRA ticket references wholesale. Any action that the
>>>>>> reader has to take to understand why a test was introduced is one too 
>>>>>> many.
>>>>>> However in some cases the thing we are trying to test is very subtle and 
>>>>>> in
>>>>>> that case a reference to a JIRA ticket might be useful, I do still feel
>>>>>> that this should be a backstop and that properly documenting your tests 
>>>>>> is
>>>>>> a much better way of dealing with this.
>>>>>>
>>>>>> Yeah, the test should be self-descriptive. I don't think adding a
>>>>>> JIRA prefix harms this point. Probably I should add this sentence in the
>>>>>> guidelines as well.
>>>>>> Adding a JIRA prefix just adds one extra hint to track down details.
>>>>>> I think it's fine to stick to this practice and make it simpler and clear
>>>>>> to follow.
>>>>>>
>>>>>> > 1. what if multiple JIRA IDs relating to the same test? we just
>>>>>> take the very first JIRA ID?
>>>>>> Ideally one JIRA should describe one issue and one PR should fix one
>>>>>> JIRA with a dedicated test.
>>>>>> Yeah, I think I would take the very first JIRA ID.
>>>>>>
>>>>>> > 2. are we going to have a full scan of all existing tests and
>>>>>> attach a JIRA ID to it?
>>>>>> Yea, let's don't do this.
>>>>>>
>>>>>> > It's a nice-to-have, not super essential, just because ...
>>>>>> It's been asked multiple times and each committer seems having a
>>>>>> different understanding on this.
>>>>>> It's not a biggie but wanted to make it clear and conclude this.
>>>>>>
>>>>>> > I'd add this only when a test specifically targets a certain issue.
>>>>>> Yes, so this one I am not sure. From what I heard, people adds the
>>>>>> JIRA in cases below:
>>>>>>
>>>>>> - Whenever the JIRA type is a bug
>>>>>> - When a PR adds a couple of tests
>>>>>> - Only when a test specifically targets a certain issue.
>>>>>> - ...
>>>>>>
>>>>>> Which one do we prefer and simpler to follow?
>>>>>>
>>>>>> Or I can combine as below (im gonna reword when I actually document
>>>>>> this):
>>>>>> 1. In general, we should add a JIRA ID as prefix of a test when a PR
>>>>>> targets to fix a specific issue.
>>>>>>     In practice, it usually happens when a JIRA type is a bug or a PR
>>>>>> adds a couple of tests.
>>>>>> 2. Uses "SPARK-XXXX: test name" format
>>>>>>
>>>>>> If we have no objection with ^, let me go with this.
>>>>>>
>>>>>> 2019년 11월 13일 (수) 오전 8:14, Sean Owen <sro...@gmail.com>님이 작성:
>>>>>>
>>>>>>> Let's suggest "SPARK-12345:" but not go back and change a bunch of
>>>>>>> test cases.
>>>>>>> I'd add this only when a test specifically targets a certain issue.
>>>>>>> It's a nice-to-have, not super essential, just because in the rare
>>>>>>> case you need to understand why a test asserts something, you can go
>>>>>>> back and find what added it in the git history without much trouble.
>>>>>>>
>>>>>>> On Mon, Nov 11, 2019 at 10:46 AM Hyukjin Kwon <gurwls...@gmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > Maybe it's not a big deal but it brought some confusions time to
>>>>>>> time into Spark dev and community. I think it's time to discuss about
>>>>>>> when/which format to add a JIRA ID as a prefix for the test case name in
>>>>>>> Scala test cases.
>>>>>>> >
>>>>>>> > Currently we have many test case names with prefixes as below:
>>>>>>> >
>>>>>>> > test("SPARK-XXXXX blah blah")
>>>>>>> > test("SPARK-XXXXX: blah blah")
>>>>>>> > test("SPARK-XXXXX - blah blah")
>>>>>>> > test("[SPARK-XXXXX] blah blah")
>>>>>>> > …
>>>>>>> >
>>>>>>> > It is a good practice to have the JIRA ID in general because, for
>>>>>>> instance,
>>>>>>> > it makes us put less efforts to track commit histories (or even
>>>>>>> when the files
>>>>>>> > are totally moved), or to track related information of tests
>>>>>>> failed.
>>>>>>> > Considering Spark's getting big, I think it's good to document.
>>>>>>> >
>>>>>>> > I would like to suggest this and document it in our guideline:
>>>>>>> >
>>>>>>> > 1. Add a prefix into a test name when a PR adds a couple of tests.
>>>>>>> > 2. Uses "SPARK-XXXX: test name" format which is used in our code
>>>>>>> base most
>>>>>>> >       often[1].
>>>>>>> >
>>>>>>> > We should make it simple and clear but closer to the actual
>>>>>>> practice. So, I would like to listen to what other people think. I would
>>>>>>> appreciate if you guys give some feedback about when to add the JIRA
>>>>>>> prefix. One alternative is that, we only add the prefix when the JIRA's
>>>>>>> type is bug.
>>>>>>> >
>>>>>>> > [1]
>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>>>>>>> >      923
>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>>>>>>> >      477
>>>>>>> > git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>>>>>>> >       16
>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>>>>>>> >       13
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>

Reply via email to