Re: Spark JIRA Report

Andrew Ash Mon, 15 Dec 2014 09:56:16 -0800

Nick,

Putting the N most stale issues into a report like your latest one does
seem like a good way to tackle the wall of text effect that I'm worried
about.


On Sun, Dec 14, 2014 at 12:28 PM, Nicholas Chammas <
[email protected]> wrote:

> Following Andrew’s suggestion, perhaps the report can just focus on
> Stale issues (no updates in > 90 days), since those are probably the
> easiest to act on.
>
> For example:
> Stale Issues
> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>
>
>    - [Oct 22, 2012] SPARK-560
>    <https://issues.apache.org/jira/browse/SPARK-560>: Specialize RDDs /
>    iterators
>    - [Oct 22, 2012] SPARK-540
>    <https://issues.apache.org/jira/browse/SPARK-540>: Add API to
>    customize in-memory representation of RDDs
>    - [Oct 22, 2012] SPARK-573
>    <https://issues.apache.org/jira/browse/SPARK-573>: Clarify semantics
>    of the parallelized closures
>    - [Nov 06, 2012] SPARK-609
>    <https://issues.apache.org/jira/browse/SPARK-609>: Add instructions
>    for enabling Akka debug logging
>    - [Dec 17, 2012] SPARK-636
>    <https://issues.apache.org/jira/browse/SPARK-636>: Add mechanism to
>    run system management/configuration tasks on all workers
>
> Andrew,
>
> Does that seem more useful?
>
> Nick
> 
>
> On Sun Dec 14 2014 at 3:20:54 AM Nicholas Chammas <
> [email protected]> wrote:
>
>> I formatted this report using Markdown; I'm open to changing the
>> structure or formatting or reducing the amount of information to make the
>> report more easily consumable.
>>
>> Regarding just sending links or whether this would just be mailing list
>> noise, those are good questions.
>>
>> I've sent out links before, but I feel from a UX perspective having the
>> information right in the email itself makes it frictionless for people to
>> act on the information. For me, that difference is enough to hook me into
>> spending a few minutes on JIRA vs. just glossing over an email with a link.
>>
>> I wonder if that's also the case for others on this list.
>>
>> If you already spend a good amount of time cleaning up on JIRA, then this
>> report won't be that relevant to you. But given the number and growth of
>> open issues on our tracker, I suspect we could do with quite a few more
>> people chipping in and cleaning up where they can.
>>
>> That's the real problem that this report is intended to help with.
>>
>> Nick
>>
>>
>>
>> On Sun Dec 14 2014 at 2:49:00 AM Andrew Ash <[email protected]> wrote:
>>
>>> The goal of increasing visibility on open issues is a good one.  How is
>>> this different from just a link to Jira though?  Some might say this adds
>>> noise to the mailing list and doesn't contain any information not already
>>> available in Jira.
>>>
>>> The idea seems good but the formatting leaves a little to be desired.
>>> If you aren't opposed to using HTML, I might suggest this more compact
>>> format:
>>>
>>> SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> Pluggable interface for shuffles
>>> SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> Add IndexedRDD, an efficient updatable key-value
>>> SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> Allow for pluggable execution contexts in Spark
>>>
>>> Andrew
>>>
>>> On Sat, Dec 13, 2014 at 11:31 PM, Nicholas Chammas <
>>> [email protected]> wrote:
>>>
>>>> What do y’all think of a report like this emailed out to the dev list
>>>> on a monthly basis?
>>>>
>>>> The goal would be to increase visibility into our open issues and
>>>> encourage developers to tend to our issue tracker more frequently.
>>>>
>>>> Nick
>>>>
>>>> There are 1,236 unresolved issues
>>>> <https://issues.apache.org/jira/issues/?jql=project+%3D+SPARK+AND+resolution+%3D+Unresolved+ORDER+BY+updated+DESC>
>>>> in the Spark project on JIRA.
>>>>
>>>> Recently Updated Issues
>>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20updated%20DESC>
>>>>
>>>> | Type | Key | Priority | Summary | Last Updated |
>>>> |---|---|---|---|---|
>>>> | Bug | SPARK-4841 <https://issues.apache.org/jira/browse/SPARK-4841> | Major | Batch serializer bug in PySpark’s RDD.zip | Dec 14, 2014 |
>>>> | Question | SPARK-4810 <https://issues.apache.org/jira/browse/SPARK-4810> | Major | Failed to run collect | Dec 14, 2014 |
>>>> | Bug | SPARK-785 <https://issues.apache.org/jira/browse/SPARK-785> | Major | ClosureCleaner not invoked on most PairRDDFunctions | Dec 14, 2014 |
>>>> | New Feature | SPARK-3405 <https://issues.apache.org/jira/browse/SPARK-3405> | Minor | EC2 cluster creation on VPC | Dec 13, 2014 |
>>>> | Improvement | SPARK-1555 <https://issues.apache.org/jira/browse/SPARK-1555> | Minor | enable ec2/spark_ec2.py to stop/delete cluster non-interactively | Dec 13, 2014 |
>>>>
>>>> Stale Issues
>>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>
>>>>
>>>> | Type | Key | Priority | Summary | Last Updated |
>>>> |---|---|---|---|---|
>>>> | Bug | SPARK-560 <https://issues.apache.org/jira/browse/SPARK-560> | None | Specialize RDDs / iterators | Oct 22, 2012 |
>>>> | New Feature | SPARK-540 <https://issues.apache.org/jira/browse/SPARK-540> | None | Add API to customize in-memory representation of RDDs | Oct 22, 2012 |
>>>> | Improvement | SPARK-573 <https://issues.apache.org/jira/browse/SPARK-573> | None | Clarify semantics of the parallelized closures | Oct 22, 2012 |
>>>> | New Feature | SPARK-609 <https://issues.apache.org/jira/browse/SPARK-609> | Minor | Add instructions for enabling Akka debug logging | Nov 06, 2012 |
>>>> | New Feature | SPARK-636 <https://issues.apache.org/jira/browse/SPARK-636> | Major | Add mechanism to run system management/configuration tasks on all workers | Dec 17, 2012 |
>>>>
>>>> Most Watched Issues
>>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20watchers%20DESC>
>>>>
>>>> | Type | Key | Priority | Summary | Watchers |
>>>> |---|---|---|---|---|
>>>> | New Feature | SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> | Major | Allow for pluggable execution contexts in Spark | 75 |
>>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 33 |
>>>> | Improvement | SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> | Major | Pluggable interface for shuffles | 30 |
>>>> | New Feature | SPARK-1405 <https://issues.apache.org/jira/browse/SPARK-1405> | Critical | parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib | 26 |
>>>> | New Feature | SPARK-1406 <https://issues.apache.org/jira/browse/SPARK-1406> | Major | PMML model evaluation support via MLib | 21 |
>>>>
>>>> Most Voted Issues
>>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20votes%20DESC>
>>>>
>>>> | Type | Key | Priority | Summary | Votes |
>>>> |---|---|---|---|---|
>>>> | Bug | SPARK-2541 <https://issues.apache.org/jira/browse/SPARK-2541> | Major | Standalone mode can’t access secure HDFS anymore | 12 |
>>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 9 |
>>>> | Improvement | SPARK-3533 <https://issues.apache.org/jira/browse/SPARK-3533> | Major | Add saveAsTextFileByKey() method to RDDs | 8 |
>>>> | Bug | SPARK-2883 <https://issues.apache.org/jira/browse/SPARK-2883> | Blocker | Spark Support for ORCFile format | 6 |
>>>> | New Feature | SPARK-1442 <https://issues.apache.org/jira/browse/SPARK-1442> | Major | Add Window function support | 6 |
