Re: Adding JIRA ID as the prefix for the test case name

2019-11-12 Thread Hyukjin Kwon
> In general a test should be self descriptive and I don't think we should
> be adding JIRA ticket references wholesale. Any action that the reader has
> to take to understand why a test was introduced is one too many. However in
> some cases the thing we are trying to test is very subtle and in that case
> a reference to a JIRA ticket might be useful, I do still feel that this
> should be a backstop and that properly documenting your tests is a much
> better way of dealing with this.

Yeah, the test should be self-descriptive. I don't think adding a JIRA
prefix harms this point. I should probably add this sentence to the
guidelines as well.
Adding a JIRA prefix just adds one extra hint for tracking down details. I
think it's fine to stick to this practice and keep it simple and clear to
follow.

> 1. what if multiple JIRA IDs relating to the same test? we just take the
> very first JIRA ID?
Ideally one JIRA should describe one issue and one PR should fix one JIRA
with a dedicated test.
Yeah, I think I would take the very first JIRA ID.

> 2. are we going to have a full scan of all existing tests and attach a
> JIRA ID to it?
Yeah, let's not do this.

> It's a nice-to-have, not super essential, just because ...
It's been asked multiple times, and each committer seems to have a different
understanding of this.
It's not a big deal, but I wanted to make it clear and settle it.

> I'd add this only when a test specifically targets a certain issue.
Yes, this is the one I am not sure about. From what I've heard, people add the
JIRA ID in the cases below:

- Whenever the JIRA type is a bug
- When a PR adds a couple of tests
- Only when a test specifically targets a certain issue.
- ...

Which one do we prefer, and which is simpler to follow?

Or I can combine them as below (I'm going to reword this when I actually
document it):
1. In general, we should add a JIRA ID as the prefix of a test name when a PR
aims to fix a specific issue.
In practice, this usually happens when the JIRA type is a bug or a PR adds
a couple of tests.
2. Use the "SPARK-: test name" format.

If there are no objections to ^, let me go with this.
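
As a rough sketch of what rule 2 means in practice (this is not part of any existing Spark tooling), a test name could be checked against the "SPARK-12345: test name" shape with a small shell function. The function name has_jira_prefix and the sample names are made up for illustration, and the exact pattern (digits, a colon, one space, a non-empty description) is an assumption about how the rule would be worded.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Spark: succeed (exit 0) when a test name
# follows the "SPARK-12345: test name" shape, fail (exit 1) otherwise.
has_jira_prefix() {
  local name="$1"
  # ^SPARK-, one or more digits, a colon, one space, then a non-empty description.
  printf '%s\n' "$name" | grep -Eq '^SPARK-[0-9]+: .+'
}
```

For example, has_jira_prefix 'SPARK-12345: blah blah' succeeds, while the other three styles listed in this thread (plain space, " - ", and "[SPARK-12345]") are rejected.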

On Wed, Nov 13, 2019 at 8:14 AM, Sean Owen wrote:

> Let's suggest "SPARK-12345:" but not go back and change a bunch of test
> cases.
> I'd add this only when a test specifically targets a certain issue.
> It's a nice-to-have, not super essential, just because in the rare
> case you need to understand why a test asserts something, you can go
> back and find what added it in the git history without much trouble.
>
> On Mon, Nov 11, 2019 at 10:46 AM Hyukjin Kwon  wrote:
> >
> > Hi all,
> >
> > Maybe it's not a big deal but it brought some confusions time to time
> > into Spark dev and community. I think it's time to discuss about when/which
> > format to add a JIRA ID as a prefix for the test case name in Scala test
> > cases.
> >
> > Currently we have many test case names with prefixes as below:
> >
> > test("SPARK-X blah blah")
> > test("SPARK-X: blah blah")
> > test("SPARK-X - blah blah")
> > test("[SPARK-X] blah blah")
> > …
> >
> > It is a good practice to have the JIRA ID in general because, for instance,
> > it makes us put less efforts to track commit histories (or even when the files
> > are totally moved), or to track related information of tests failed.
> > Considering Spark's getting big, I think it's good to document.
> >
> > I would like to suggest this and document it in our guideline:
> >
> > 1. Add a prefix into a test name when a PR adds a couple of tests.
> > 2. Uses "SPARK-: test name" format which is used in our code base most
> >   often[1].
> >
> > We should make it simple and clear but closer to the actual practice.
> > So, I would like to listen to what other people think. I would appreciate
> > if you guys give some feedback about when to add the JIRA prefix. One
> > alternative is that, we only add the prefix when the JIRA's type is bug.
> >
> > [1]
> > git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
> >  923
> > git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
> >  477
> > git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
> >   16
> > git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
> >   13
> >
> >
> >
>
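
The counts in [1] come from running git grep over the whole repository. The sketch below shows the same idea on a single file with plain grep -E. The helper name count_style and the pattern variables are made up for illustration. The patterns drop the backslash before S that appears in the quoted commands, which looks unintended (some regex engines read \S as "any non-whitespace character").

```shell
#!/usr/bin/env bash
# Illustrative patterns for the four prefix styles seen in the thread.
# The variable names are hypothetical; only grep -E itself is assumed.
COLON_STYLE='test\("SPARK-[0-9]+:'
SPACE_STYLE='test\("SPARK-[0-9]+ '      # also matches the "SPARK-X - " style
DASH_STYLE='test\("SPARK-[0-9]+ -'
BRACKET_STYLE='test\("\[SPARK-[0-9]+]'

# Print how many lines of a file match the given extended regex.
count_style() {
  local pattern="$1" file="$2"
  # grep -c prints 0 and exits non-zero when nothing matches, so ignore the status.
  grep -E -c -- "$pattern" "$file" || true
}
```

For example, count_style "$COLON_STYLE" path/to/SomeSuite.scala (a placeholder path) prints the number of tests in that file that use the colon style.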


Re: [build system] jenkins wedged, needed a quick restart

2019-11-12 Thread Takeshi Yamamuro
thx as always, Shane!


On Wed, Nov 13, 2019 at 3:25 AM Shane Knapp  wrote:

> it's coming back up now.
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

-- 
---
Takeshi Yamamuro


Re: Adding JIRA ID as the prefix for the test case name

2019-11-12 Thread Sean Owen
Let's suggest "SPARK-12345:" but not go back and change a bunch of test cases.
I'd add this only when a test specifically targets a certain issue.
It's a nice-to-have, not super essential, just because in the rare
case you need to understand why a test asserts something, you can go
back and find what added it in the git history without much trouble.
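
A minimal sketch of that git-history lookup, assuming a local clone: git log -S (the "pickaxe" option) lists commits that change the number of occurrences of a string, so the oldest hit is normally the commit that introduced the test. The function name find_introducing_commit is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the oldest commit (short hash and subject) whose
# diff changed the number of occurrences of the given string, for example a
# test name or a JIRA ID. Run it from inside the repository.
find_introducing_commit() {
  local needle="$1"
  # --reverse lists the oldest matching commit first, so keep only line one.
  git log --reverse --format='%h %s' -S"$needle" | head -n 1
}
```

For example, find_introducing_commit 'SPARK-12345: blah blah' prints the commit that first added a test with that name, if there is one.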

On Mon, Nov 11, 2019 at 10:46 AM Hyukjin Kwon  wrote:
>
> Hi all,
>
> Maybe it's not a big deal but it brought some confusions time to time into 
> Spark dev and community. I think it's time to discuss about when/which format 
> to add a JIRA ID as a prefix for the test case name in Scala test cases.
>
> Currently we have many test case names with prefixes as below:
>
> test("SPARK-X blah blah")
> test("SPARK-X: blah blah")
> test("SPARK-X - blah blah")
> test("[SPARK-X] blah blah")
> …
>
> It is a good practice to have the JIRA ID in general because, for instance,
> it makes us put less efforts to track commit histories (or even when the files
> are totally moved), or to track related information of tests failed.
> Considering Spark's getting big, I think it's good to document.
>
> I would like to suggest this and document it in our guideline:
>
> 1. Add a prefix into a test name when a PR adds a couple of tests.
> 2. Uses "SPARK-: test name" format which is used in our code base most
>   often[1].
>
> We should make it simple and clear but closer to the actual practice. So, I 
> would like to listen to what other people think. I would appreciate if you 
> guys give some feedback about when to add the JIRA prefix. One alternative is 
> that, we only add the prefix when the JIRA's type is bug.
>
> [1]
> git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>  923
> git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>  477
> git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>   16
> git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>   13
>
>
>




Re: Adding JIRA ID as the prefix for the test case name

2019-11-12 Thread Xin Ren
+1

Two confusions to clarify:
1. what if multiple JIRA IDs relating to the same test? we just take the
very first JIRA ID?
2. are we going to have a full scan of all existing tests and attach a JIRA
ID to it?

Thank you Hyukjin :)

On Tue, Nov 12, 2019 at 1:47 PM Dongjoon Hyun 
wrote:

> Thank you for the suggestion, Hyukjin.
>
> Previously, we added Jira IDs for the bug fix PR test cases as Gabor said.
>
> For the new features (and improvements), we didn't add them
>
> because all test cases in the newly added test suite share the same prefix
> JIRA ID in that case.
>
> It might looks redundant.
>
> However, I'm +1 for Hyukjin's original suggestion because we had better
> have the official rule for this in some ways.
>
> Thank you again, Hyukjin.
>
> Bests,
> Dongjoon.
>
>
>
> On Tue, Nov 12, 2019 at 1:13 AM Gabor Somogyi 
> wrote:
>
>> +1 for having that consistent rule in test names.
>> +1 for making it a guideline.
>> +1 defining exact guides in general.
>>
>> Until now I've followed the alternative (only add the prefix when the
>> JIRA's type is bug) and that way I knew that such tests contain edge cases.
>> In case of new features I'm pretty sure there is a reason to introduce it
>> but at the moment can't imagine a use-case where it can help us (want to
>> convert it to daily routine).
>>
>> > This is helpful when the test cases are moved to a different file.
>> The test can be found by name without jira ID
>>
>>
>> On Tue, Nov 12, 2019 at 5:31 AM Hyukjin Kwon  wrote:
>>
>>> In a few days, I will write this in our guidelines, probably after
>>> rewording it a bit better:
>>>
>>> 1. Add a prefix into a test name when a PR adds a couple of tests.
>>> 2. Uses "SPARK-: test name" format.
>>>
>>> Please let me know if you have any different opinion about what/when to
>>> write the JIRA ID as the prefix.
>>> I would like to make sure this simple rule is closer to the actual
>>> practice from you guys.
>>>
>>>
>>> On Tue, Nov 12, 2019 at 8:41 AM, Gengliang wrote:
>>>
>>>> +1 for making it a guideline. This is helpful when the test cases are
>>>> moved to a different file.
>>>>
>>>> On Mon, Nov 11, 2019 at 3:23 PM Takeshi Yamamuro
>>>> wrote:

> +1 for having that consistent rule in test names.
> This is a trivial problem though, I think documenting this rule in the
> contribution guide
> might be able to make reviewer overhead a little smaller.
>
> Bests,
> Takeshi
>
> On Tue, Nov 12, 2019 at 1:46 AM Hyukjin Kwon 
> wrote:
>
>> Hi all,
>>
>> Maybe it's not a big deal but it brought some confusions time to time
>> into Spark dev and community. I think it's time to discuss about 
>> when/which
>> format to add a JIRA ID as a prefix for the test case name in Scala test
>> cases.
>>
>> Currently we have many test case names with prefixes as below:
>>
>>- test("SPARK-X blah blah")
>>- test("SPARK-X: blah blah")
>>- test("SPARK-X - blah blah")
>>- test("[SPARK-X] blah blah")
>>- …
>>
>> It is a good practice to have the JIRA ID in general because, for
>> instance,
>> it makes us put less efforts to track commit histories (or even when
>> the files
>> are totally moved), or to track related information of tests failed.
>> Considering Spark's getting big, I think it's good to document.
>>
>> I would like to suggest this and document it in our guideline:
>>
>> 1. Add a prefix into a test name when a PR adds a couple of tests.
>> 2. Uses "SPARK-: test name" format which is used in our code base
>> most
>>   often[1].
>>
>> We should make it simple and clear but closer to the actual practice.
>> So, I would like to listen to what other people think. I would appreciate
>> if you guys give some feedback about when to add the JIRA prefix. One
>> alternative is that, we only add the prefix when the JIRA's type is bug.
>>
>> [1]
>> git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>>  923
>> git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>>  477
>> git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>>   16
>> git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>>   13
>>
>>
>>
>>
>
> --
> ---
> Takeshi Yamamuro
>



Re: Adding JIRA ID as the prefix for the test case name

2019-11-12 Thread Dongjoon Hyun
Thank you for the suggestion, Hyukjin.

Previously, we added JIRA IDs to the test cases of bug-fix PRs, as Gabor said.

For new features (and improvements), we didn't add them because all test
cases in a newly added test suite share the same JIRA ID prefix in that case.
It might look redundant.

However, I'm +1 for Hyukjin's original suggestion because we had better
have an official rule for this in some way.

Thank you again, Hyukjin.

Bests,
Dongjoon.



On Tue, Nov 12, 2019 at 1:13 AM Gabor Somogyi 
wrote:

> +1 for having that consistent rule in test names.
> +1 for making it a guideline.
> +1 defining exact guides in general.
>
> Until now I've followed the alternative (only add the prefix when the
> JIRA's type is bug) and that way I knew that such tests contain edge cases.
> In case of new features I'm pretty sure there is a reason to introduce it
> but at the moment can't imagine a use-case where it can help us (want to
> convert it to daily routine).
>
> > This is helpful when the test cases are moved to a different file.
> The test can be found by name without jira ID
>
>
> On Tue, Nov 12, 2019 at 5:31 AM Hyukjin Kwon  wrote:
>
>> In a few days, I will write this in our guidelines, probably after rewording
>> it a bit better:
>>
>> 1. Add a prefix into a test name when a PR adds a couple of tests.
>> 2. Uses "SPARK-: test name" format.
>>
>> Please let me know if you have any different opinion about what/when to
>> write the JIRA ID as the prefix.
>> I would like to make sure this simple rule is closer to the actual
>> practice from you guys.
>>
>>
>> On Tue, Nov 12, 2019 at 8:41 AM, Gengliang wrote:
>>
>>> +1 for making it a guideline. This is helpful when the test cases are
>>> moved to a different file.
>>>
>>> On Mon, Nov 11, 2019 at 3:23 PM Takeshi Yamamuro 
>>> wrote:
>>>
>>>> +1 for having that consistent rule in test names.
>>>> This is a trivial problem though, I think documenting this rule in the
>>>> contribution guide
>>>> might be able to make reviewer overhead a little smaller.
>>>>
>>>> Bests,
>>>> Takeshi
>>>>
>>>> On Tue, Nov 12, 2019 at 1:46 AM Hyukjin Kwon
>>>> wrote:

> Hi all,
>
> Maybe it's not a big deal but it brought some confusions time to time
> into Spark dev and community. I think it's time to discuss about 
> when/which
> format to add a JIRA ID as a prefix for the test case name in Scala test
> cases.
>
> Currently we have many test case names with prefixes as below:
>
>- test("SPARK-X blah blah")
>- test("SPARK-X: blah blah")
>- test("SPARK-X - blah blah")
>- test("[SPARK-X] blah blah")
>- …
>
> It is a good practice to have the JIRA ID in general because, for
> instance,
> it makes us put less efforts to track commit histories (or even when
> the files
> are totally moved), or to track related information of tests failed.
> Considering Spark's getting big, I think it's good to document.
>
> I would like to suggest this and document it in our guideline:
>
> 1. Add a prefix into a test name when a PR adds a couple of tests.
> 2. Uses "SPARK-: test name" format which is used in our code base
> most
>   often[1].
>
> We should make it simple and clear but closer to the actual practice.
> So, I would like to listen to what other people think. I would appreciate
> if you guys give some feedback about when to add the JIRA prefix. One
> alternative is that, we only add the prefix when the JIRA's type is bug.
>
> [1]
> git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>  923
> git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>  477
> git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>   16
> git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>   13
>
>
>
>

>>>> --
>>>> ---
>>>> Takeshi Yamamuro

>>>


Re: ASF board report for November 2019

2019-11-12 Thread Matei Zaharia
Oops, sorry about the typo there; I’ll correct that.

> On Nov 12, 2019, at 12:43 AM, ruifengz  wrote:
> 
> nit: Ruifeng Zhang as committers in the past three months. <- Ruifeng Zheng
> 
> ☺Thanks
> 
> On 11/12/19 3:54 PM, Matei Zaharia wrote:
>> Good catch, thanks.
>> 
>>> On Nov 11, 2019, at 6:46 PM, Jungtaek Lim wrote:
>>> 
>>> nit: - The latest committer was added on Sept 4th, 2019 (Dongjoon Hyun). <= 
>>> s/committer/PMC member
>>> 
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Tue, Nov 12, 2019 at 11:38 AM Matei Zaharia wrote:
>>> Hi all,
>>> 
>>> It’s time to send our quarterly report to the ASF board. Here is my draft — 
>>> please feel free to suggest any changes.
>>> 
>>> 
>>> 
>>> Apache Spark is a fast and general engine for large-scale data processing. It
>>> offers high-level APIs in Java, Scala, Python and R as well as a rich set of
>>> libraries including stream processing, machine learning, and graph analytics.
>>> 
>>> Project status:
>>> 
>>> - We made the first preview release for Spark 3.0 on November 6th. This
>>>   release aims to get early feedback on the new APIs and functionality
>>>   targeting Spark 3.0 but does not provide API or stability guarantees. We
>>>   encourage community members to try this release and leave feedback on
>>>   JIRA. More info about what’s new and how to report feedback is found at
>>>   https://spark.apache.org/news/spark-3.0.0-preview.html.
>>> 
>>> - We published Spark 2.4.4 and 2.3.4 as maintenance releases to fix bugs
>>>   in the 2.4 and 2.3 branches.
>>> 
>>> - We added one new PMC member and six committers to the project
>>>   in August and September, covering data sources, streaming, SQL, ML
>>>   and other components of the project.
>>> 
>>> Trademarks:
>>> 
>>> - Nothing new to report since August.
>>> 
>>> Latest releases:
>>> 
>>> - Spark 3.0.0-preview was released on Nov 6th, 2019.
>>> - Spark 2.3.4 was released on Sept 9th, 2019.
>>> - Spark 2.4.4 was released on Sept 1st, 2019.
>>> 
>>> Committers and PMC:
>>> 
>>> - The latest committer was added on Sept 4th, 2019 (Dongjoon Hyun).
>>> - The latest committer was added on Sept 9th, 2019 (Weichen Xu). We
>>>   also added Ryan Blue, L.C. Hsieh, Gengliang Wang, Yuming Wang and
>>>   Ruifeng Zhang as committers in the past three months.
>>> 
>>> 
>>> 
>>> 
>> 



[build system] jenkins wedged, needed a quick restart

2019-11-12 Thread Shane Knapp
it's coming back up now.

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu




Re: Adding JIRA ID as the prefix for the test case name

2019-11-12 Thread Gabor Somogyi
+1 for having that consistent rule in test names.
+1 for making it a guideline.
+1 defining exact guides in general.

Until now I've followed the alternative (only add the prefix when the
JIRA's type is bug) and that way I knew that such tests contain edge cases.
In case of new features I'm pretty sure there is a reason to introduce it
but at the moment can't imagine a use-case where it can help us (want to
convert it to daily routine).

> This is helpful when the test cases are moved to a different file.
The test can be found by name without jira ID


On Tue, Nov 12, 2019 at 5:31 AM Hyukjin Kwon  wrote:

> In a few days, I will write this in our guidelines, probably after rewording
> it a bit better:
>
> 1. Add a prefix into a test name when a PR adds a couple of tests.
> 2. Uses "SPARK-: test name" format.
>
> Please let me know if you have any different opinion about what/when to
> write the JIRA ID as the prefix.
> I would like to make sure this simple rule is closer to the actual
> practice from you guys.
>
>
> On Tue, Nov 12, 2019 at 8:41 AM, Gengliang wrote:
>
>> +1 for making it a guideline. This is helpful when the test cases are
>> moved to a different file.
>>
>> On Mon, Nov 11, 2019 at 3:23 PM Takeshi Yamamuro 
>> wrote:
>>
>>> +1 for having that consistent rule in test names.
>>> This is a trivial problem though, I think documenting this rule in the
>>> contribution guide
>>> might be able to make reviewer overhead a little smaller.
>>>
>>> Bests,
>>> Takeshi
>>>
>>> On Tue, Nov 12, 2019 at 1:46 AM Hyukjin Kwon 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Maybe it's not a big deal but it brought some confusions time to time
>>>> into Spark dev and community. I think it's time to discuss about when/which
>>>> format to add a JIRA ID as a prefix for the test case name in Scala test
>>>> cases.
>>>>
>>>> Currently we have many test case names with prefixes as below:
>>>>
>>>>    - test("SPARK-X blah blah")
>>>>    - test("SPARK-X: blah blah")
>>>>    - test("SPARK-X - blah blah")
>>>>    - test("[SPARK-X] blah blah")
>>>>    - …
>>>>
>>>> It is a good practice to have the JIRA ID in general because, for
>>>> instance,
>>>> it makes us put less efforts to track commit histories (or even when
>>>> the files
>>>> are totally moved), or to track related information of tests failed.
>>>> Considering Spark's getting big, I think it's good to document.
>>>>
>>>> I would like to suggest this and document it in our guideline:
>>>>
>>>> 1. Add a prefix into a test name when a PR adds a couple of tests.
>>>> 2. Uses "SPARK-: test name" format which is used in our code base
>>>> most
>>>>   often[1].
>>>>
>>>> We should make it simple and clear but closer to the actual practice.
>>>> So, I would like to listen to what other people think. I would appreciate
>>>> if you guys give some feedback about when to add the JIRA prefix. One
>>>> alternative is that, we only add the prefix when the JIRA's type is bug.
>>>>
>>>> [1]
>>>> git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>>>>  923
>>>> git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>>>>  477
>>>> git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>>>>   16
>>>> git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>>>>   13




>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>


Re: ASF board report for November 2019

2019-11-12 Thread ruifengz

nit: Ruifeng Zhang as committers in the past three months. <- Ruifeng Zheng

☺Thanks

On 11/12/19 3:54 PM, Matei Zaharia wrote:

Good catch, thanks.

On Nov 11, 2019, at 6:46 PM, Jungtaek Lim wrote:


nit: - The latest committer was added on Sept 4th, 2019 (Dongjoon 
Hyun). <= s/committer/PMC member


Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Nov 12, 2019 at 11:38 AM Matei Zaharia wrote:


Hi all,

It’s time to send our quarterly report to the ASF board. Here is
my draft — please feel free to suggest any changes.



Apache Spark is a fast and general engine for large-scale data processing. It
offers high-level APIs in Java, Scala, Python and R as well as a rich set of
libraries including stream processing, machine learning, and graph analytics.

Project status:

- We made the first preview release for Spark 3.0 on November 6th. This
  release aims to get early feedback on the new APIs and functionality
  targeting Spark 3.0 but does not provide API or stability guarantees. We
  encourage community members to try this release and leave feedback on
  JIRA. More info about what’s new and how to report feedback is found at
  https://spark.apache.org/news/spark-3.0.0-preview.html.

- We published Spark 2.4.4 and 2.3.4 as maintenance releases to fix bugs
  in the 2.4 and 2.3 branches.

- We added one new PMC member and six committers to the project
  in August and September, covering data sources, streaming, SQL, ML
  and other components of the project.

Trademarks:

- Nothing new to report since August.

Latest releases:

- Spark 3.0.0-preview was released on Nov 6th, 2019.
- Spark 2.3.4 was released on Sept 9th, 2019.
- Spark 2.4.4 was released on Sept 1st, 2019.

Committers and PMC:

- The latest committer was added on Sept 4th, 2019 (Dongjoon Hyun).
- The latest committer was added on Sept 9th, 2019 (Weichen Xu). We
  also added Ryan Blue, L.C. Hsieh, Gengliang Wang, Yuming Wang and
  Ruifeng Zhang as committers in the past three months.

