Hi all,

thanks for bringing up the topic, Sean. I agree with Reynold's idea too, but
in this specific case, if there is an error, the timezone is part of the
error message, so we know exactly which timezone caused the failure. Hence I
thought that logging the seed is not necessary, as we can rerun directly with
the failing timezone.
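
A minimal sketch of what I mean, using ScalaTest's withClue (the suite name,
the pattern and the chosen instant below are made up for illustration, not
the actual code in the PR):

  import java.text.SimpleDateFormat
  import java.util.{Date, TimeZone}
  import org.scalatest.FunSuite

  class TimeZoneErrorMessageSuite extends FunSuite {
    test("format/parse round trip in every available time zone") {
      val instant = new Date(1538995200000L) // 2018-10-08 10:40:00 UTC
      for (tzId <- TimeZone.getAvailableIDs.toSeq.sorted) {
        // withClue prepends the zone id to any assertion failure, so the
        // failing time zone is visible directly in the error message.
        withClue(s"Time zone: $tzId -- ") {
          val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
          fmt.setTimeZone(TimeZone.getTimeZone(tzId))
          assert(fmt.parse(fmt.format(instant)) == instant)
        }
      }
    }
  }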

Thanks,
Marco

Il giorno lun 8 ott 2018 alle ore 16:24 Xiao Li <lix...@databricks.com> ha
scritto:

> For this specific case, I do not think we should test all the timezones. If
> this were fast, I would be fine with leaving it unchanged; however, it is
> very slow. Thus, I would even prefer reducing the tested timezones to a
> smaller number, or just hardcoding some specific time zones.
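>
> For example, a small hand-picked set could still cover the interesting
> cases (the particular zones below are only illustrative):
>
>   import java.util.TimeZone
>
>   // UTC, a negative offset, a positive offset, a half-hour offset,
>   // and zones with DST transitions.
>   val testTimeZones: Seq[TimeZone] = Seq(
>     "UTC", "America/Los_Angeles", "Asia/Shanghai",
>     "Asia/Kolkata", "Europe/Paris"
>   ).map(TimeZone.getTimeZone)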
>
> In general, I like Reynold’s idea of including the seed value and adding it
> to the test case name. This can help us reproduce failures.
>
> Xiao
>
> On Mon, Oct 8, 2018 at 7:08 AM Reynold Xin <r...@databricks.com> wrote:
>
>> I'm personally not a big fan of doing it the way that PR did. It is
>> perfectly fine to employ randomized tests, and in this case it might even
>> be fine to just pick a couple of different timezones, as was done in the
>> PR, but we should:
>>
>> 1. Document in the code comment why we did it that way.
>>
>> 2. Use a seed and log it, so any test failure can be reproduced
>> deterministically. For this one, it would be better to read the seed from
>> an environment variable and, if the variable is not set, fall back to a
>> random seed, roughly along the lines of the sketch below.
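>>
>> A rough sketch of what that could look like (the environment variable name
>> and the sample size are only illustrative, and a real suite would log the
>> seed through its logger rather than println):
>>
>>   import java.util.TimeZone
>>   import scala.util.Random
>>
>>   // Take the seed from an environment variable so a CI failure can be
>>   // replayed locally; otherwise fall back to a random one.
>>   val seed: Long = sys.env.get("SPARK_TEST_TIMEZONE_SEED")
>>     .map(_.toLong)
>>     .getOrElse(new Random().nextLong())
>>   println(s"Using seed $seed for time zone sampling")
>>
>>   val rng = new Random(seed)
>>   // Given the same seed, the same subset of zones is tested again.
>>   val sampledZoneIds: Seq[String] =
>>     rng.shuffle(TimeZone.getAvailableIDs.toSeq).take(50)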
>>
>>
>>
>> On Mon, Oct 8, 2018 at 3:05 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Recently, I've seen 3 pull requests that try to speed up a test suite
>>> that tests a bunch of cases by randomly choosing different subsets of
>>> cases to test on each Jenkins run.
>>>
>>> There's disagreement about whether this is a good approach to improving
>>> test runtime. Here's a discussion on one that was committed:
>>> https://github.com/apache/spark/pull/22631/files#r223190476
>>>
>>> I'm flagging it for more input.
>>>
