Actually there are not so many Java test cases in Spark (because Scala runs on JVM as everybody knows)[1].
Given that, I think we can avoid to put some efforts on this for now .. I don't mind if somebody wants to give a shot since it looks good anyway but to me I wouldn't spend so much time on this .. Let me just go ahead as I suggested if you don't mind. Anyone can give a shot for Display Name - I'm willing to actively review and help. [1] git ls-files '*Suite.java' | wc -l 172 git ls-files '*Suite.scala' | wc -l 1161 2019년 11월 18일 (월) 오전 3:27, Steve Loughran <ste...@cloudera.com>님이 작성: > Test reporters do often contain some assumptions about the characters in > the test methods. Historically JUnit XML reporters have never sanitised the > method names so XML injection attacks have been fairly trivial. Haven't > tried this for a while. > > That whole JUnit XML report "standard" was actually put together in the > Ant project with <Junitreport> doing the postprocessing of the JUnit run. > It was driven by the team's XSL skills than any overreaching strategic goal > about how to present test results of tests which could run for hours and > whose output you would really want to aggregate the locks from multiple > machines and processes and present in awake you can actually navigate. With > hindsight, a key failing is that we chose to store the test summaries (test > count, failure count...) as attributes on the root XML mode. Which is why > the whole DOM gets built up in the JUnit runner. Which is why when that > JUnit process crashes, you get no report at all. > > It'd be straightforward to fix -except too much relies on that file > now...important things will break. And the maven runner has historically > never supported custom reporters, to let you experiment with it. > > Maybe this is an opportunity to change things. > > On Sun, Nov 17, 2019 at 1:42 AM Hyukjin Kwon <gurwls...@gmail.com> wrote: > >> DisplayName looks good in general but actually here I would like first to >> find a existing pattern to document in guidelines given the actual existing >> practice we all are used to. I'm trying to be very conservative since this >> guidelines affect everybody. >> >> I think it might be better to discuss separately if we want to change >> what we have been used to. >> >> Also, using arbitrary names might not be actually free due to such bug >> like https://github.com/apache/spark/pull/25630 . It will need some more >> efforts to investigate as well. >> >> On Fri, 15 Nov 2019, 20:56 Steve Loughran, <ste...@cloudera.com.invalid> >> wrote: >> >>> Junit5: Display names. >>> >>> Goes all the way to the XML. >>> >>> >>> https://junit.org/junit5/docs/current/user-guide/#writing-tests-display-names >>> >>> On Thu, Nov 14, 2019 at 6:13 PM Shixiong(Ryan) Zhu < >>> shixi...@databricks.com> wrote: >>> >>>> Should we also add a guideline for non Scala tests? Other languages >>>> (Java, Python, R) don't support using string as a test name. >>>> >>>> Best Regards, >>>> Ryan >>>> >>>> >>>> On Thu, Nov 14, 2019 at 4:04 AM Hyukjin Kwon <gurwls...@gmail.com> >>>> wrote: >>>> >>>>> I opened a PR - https://github.com/apache/spark-website/pull/231 >>>>> >>>>> 2019년 11월 13일 (수) 오전 10:43, Hyukjin Kwon <gurwls...@gmail.com>님이 작성: >>>>> >>>>>> > In general a test should be self descriptive and I don't think we >>>>>> should be adding JIRA ticket references wholesale. Any action that the >>>>>> reader has to take to understand why a test was introduced is one too >>>>>> many. >>>>>> However in some cases the thing we are trying to test is very subtle and >>>>>> in >>>>>> that case a reference to a JIRA ticket might be useful, I do still feel >>>>>> that this should be a backstop and that properly documenting your tests >>>>>> is >>>>>> a much better way of dealing with this. >>>>>> >>>>>> Yeah, the test should be self-descriptive. I don't think adding a >>>>>> JIRA prefix harms this point. Probably I should add this sentence in the >>>>>> guidelines as well. >>>>>> Adding a JIRA prefix just adds one extra hint to track down details. >>>>>> I think it's fine to stick to this practice and make it simpler and clear >>>>>> to follow. >>>>>> >>>>>> > 1. what if multiple JIRA IDs relating to the same test? we just >>>>>> take the very first JIRA ID? >>>>>> Ideally one JIRA should describe one issue and one PR should fix one >>>>>> JIRA with a dedicated test. >>>>>> Yeah, I think I would take the very first JIRA ID. >>>>>> >>>>>> > 2. are we going to have a full scan of all existing tests and >>>>>> attach a JIRA ID to it? >>>>>> Yea, let's don't do this. >>>>>> >>>>>> > It's a nice-to-have, not super essential, just because ... >>>>>> It's been asked multiple times and each committer seems having a >>>>>> different understanding on this. >>>>>> It's not a biggie but wanted to make it clear and conclude this. >>>>>> >>>>>> > I'd add this only when a test specifically targets a certain issue. >>>>>> Yes, so this one I am not sure. From what I heard, people adds the >>>>>> JIRA in cases below: >>>>>> >>>>>> - Whenever the JIRA type is a bug >>>>>> - When a PR adds a couple of tests >>>>>> - Only when a test specifically targets a certain issue. >>>>>> - ... >>>>>> >>>>>> Which one do we prefer and simpler to follow? >>>>>> >>>>>> Or I can combine as below (im gonna reword when I actually document >>>>>> this): >>>>>> 1. In general, we should add a JIRA ID as prefix of a test when a PR >>>>>> targets to fix a specific issue. >>>>>> In practice, it usually happens when a JIRA type is a bug or a PR >>>>>> adds a couple of tests. >>>>>> 2. Uses "SPARK-XXXX: test name" format >>>>>> >>>>>> If we have no objection with ^, let me go with this. >>>>>> >>>>>> 2019년 11월 13일 (수) 오전 8:14, Sean Owen <sro...@gmail.com>님이 작성: >>>>>> >>>>>>> Let's suggest "SPARK-12345:" but not go back and change a bunch of >>>>>>> test cases. >>>>>>> I'd add this only when a test specifically targets a certain issue. >>>>>>> It's a nice-to-have, not super essential, just because in the rare >>>>>>> case you need to understand why a test asserts something, you can go >>>>>>> back and find what added it in the git history without much trouble. >>>>>>> >>>>>>> On Mon, Nov 11, 2019 at 10:46 AM Hyukjin Kwon <gurwls...@gmail.com> >>>>>>> wrote: >>>>>>> > >>>>>>> > Hi all, >>>>>>> > >>>>>>> > Maybe it's not a big deal but it brought some confusions time to >>>>>>> time into Spark dev and community. I think it's time to discuss about >>>>>>> when/which format to add a JIRA ID as a prefix for the test case name in >>>>>>> Scala test cases. >>>>>>> > >>>>>>> > Currently we have many test case names with prefixes as below: >>>>>>> > >>>>>>> > test("SPARK-XXXXX blah blah") >>>>>>> > test("SPARK-XXXXX: blah blah") >>>>>>> > test("SPARK-XXXXX - blah blah") >>>>>>> > test("[SPARK-XXXXX] blah blah") >>>>>>> > … >>>>>>> > >>>>>>> > It is a good practice to have the JIRA ID in general because, for >>>>>>> instance, >>>>>>> > it makes us put less efforts to track commit histories (or even >>>>>>> when the files >>>>>>> > are totally moved), or to track related information of tests >>>>>>> failed. >>>>>>> > Considering Spark's getting big, I think it's good to document. >>>>>>> > >>>>>>> > I would like to suggest this and document it in our guideline: >>>>>>> > >>>>>>> > 1. Add a prefix into a test name when a PR adds a couple of tests. >>>>>>> > 2. Uses "SPARK-XXXX: test name" format which is used in our code >>>>>>> base most >>>>>>> > often[1]. >>>>>>> > >>>>>>> > We should make it simple and clear but closer to the actual >>>>>>> practice. So, I would like to listen to what other people think. I would >>>>>>> appreciate if you guys give some feedback about when to add the JIRA >>>>>>> prefix. One alternative is that, we only add the prefix when the JIRA's >>>>>>> type is bug. >>>>>>> > >>>>>>> > [1] >>>>>>> > git grep -E 'test\("\SPARK-([0-9]+):' | wc -l >>>>>>> > 923 >>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l >>>>>>> > 477 >>>>>>> > git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l >>>>>>> > 16 >>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l >>>>>>> > 13 >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> >>>>>>