Re: Spark JIRA Report

Nicholas Chammas Sun, 14 Dec 2014 12:30:07 -0800

Following Andrew’s suggestion, perhaps the report can focus only on
stale issues (no updates in more than 90 days), since those are probably
the easiest to act on.


For example:
Stale Issues
<https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>

   - [Oct 22, 2012] SPARK-560
   <https://issues.apache.org/jira/browse/SPARK-560>: Specialize RDDs /
   iterators
   - [Oct 22, 2012] SPARK-540
   <https://issues.apache.org/jira/browse/SPARK-540>: Add API to customize
   in-memory representation of RDDs
   - [Oct 22, 2012] SPARK-573
   <https://issues.apache.org/jira/browse/SPARK-573>: Clarify semantics of
   the parallelized closures
   - [Nov 06, 2012] SPARK-609
   <https://issues.apache.org/jira/browse/SPARK-609>: Add instructions for
   enabling Akka debug logging
   - [Dec 17, 2012] SPARK-636
   <https://issues.apache.org/jira/browse/SPARK-636>: Add mechanism to run
   system management/configuration tasks on all workers
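
A minimal sketch of how the link and list entries above could be generated, assuming Python. The names `stale_issues_url` and `format_issue` are hypothetical, and the issue data is hard-coded from the list above rather than fetched from JIRA.

```python
from datetime import date
from urllib.parse import quote

JIRA_BASE = "https://issues.apache.org/jira"
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def stale_issues_url(project: str = "SPARK", days: int = 90) -> str:
    """Build the JIRA search URL for unresolved issues not updated in `days` days."""
    jql = (
        f"project = {project} AND resolution = Unresolved "
        f"AND updated <= -{days}d ORDER BY updated ASC"
    )
    # Percent-encode the whole query, including spaces, '=' and '<'.
    return f"{JIRA_BASE}/issues/?jql={quote(jql, safe='')}"


def format_issue(key: str, summary: str, updated: date) -> str:
    """Render one issue as a list entry: last-updated date, key, link, summary."""
    # Use a fixed month table so the output does not depend on the locale.
    stamp = f"{MONTHS[updated.month - 1]} {updated.day:02d}, {updated.year}"
    return f"- [{stamp}] {key} <{JIRA_BASE}/browse/{key}>: {summary}"


if __name__ == "__main__":
    print(stale_issues_url())
    print(format_issue("SPARK-560", "Specialize RDDs / iterators", date(2012, 10, 22)))
```

The 90-day threshold is a parameter here, so the same helper would cover a stricter or looser definition of "stale" if the list prefers one.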

Andrew,

Does that seem more useful?

Nick


On Sun Dec 14 2014 at 3:20:54 AM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> I formatted this report using Markdown; I'm open to changing the structure
> or formatting or reducing the amount of information to make the report more
> easily consumable.
>
> As for whether sending just links would do, or whether this would just be
> mailing list noise, those are good questions.
>
> I've sent out links before, but I feel from a UX perspective having the
> information right in the email itself makes it frictionless for people to
> act on the information. For me, that difference is enough to hook me into
> spending a few minutes on JIRA vs. just glossing over an email with a link.
>
> I wonder if that's also the case for others on this list.
>
> If you already spend a good amount of time cleaning up on JIRA, then this
> report won't be that relevant to you. But given the number and growth of
> open issues on our tracker, I suspect we could do with quite a few more
> people chipping in and cleaning up where they can.
>
> That's the real problem that this report is intended to help with.
>
> Nick
>
>
>
> On Sun Dec 14 2014 at 2:49:00 AM Andrew Ash <and...@andrewash.com> wrote:
>
>> The goal of increasing visibility on open issues is a good one.  How is
>> this different from just a link to Jira though?  Some might say this adds
>> noise to the mailing list and doesn't contain any information not already
>> available in Jira.
>>
>> The idea seems good but the formatting leaves a little to be desired.  If
>> you aren't opposed to using HTML, I might suggest this more compact format:
>>
>> SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> Pluggable interface for shuffles
>> SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> Add IndexedRDD, an efficient updatable key-value
>> SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> Allow for pluggable execution contexts in Spark
>>
>> Andrew
>>
>> On Sat, Dec 13, 2014 at 11:31 PM, Nicholas Chammas <
>> nicholas.cham...@gmail.com> wrote:
>>
>>> What do y’all think of a report like this emailed out to the dev list on a
>>> monthly basis?
>>>
>>> The goal would be to increase visibility into our open issues and encourage
>>> developers to tend to our issue tracker more frequently.
>>>
>>> Nick
>>>
>>> There are 1,236 unresolved issues
>>> <https://issues.apache.org/jira/issues/?jql=project+%3D+SPARK+AND+resolution+%3D+Unresolved+ORDER+BY+updated+DESC>
>>> in the Spark project on JIRA.
>>>
>>> Recently Updated Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20updated%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Last Updated |
>>> |---|---|---|---|---|
>>> | Bug | SPARK-4841 <https://issues.apache.org/jira/browse/SPARK-4841> | Major | Batch serializer bug in PySpark’s RDD.zip | Dec 14, 2014 |
>>> | Question | SPARK-4810 <https://issues.apache.org/jira/browse/SPARK-4810> | Major | Failed to run collect | Dec 14, 2014 |
>>> | Bug | SPARK-785 <https://issues.apache.org/jira/browse/SPARK-785> | Major | ClosureCleaner not invoked on most PairRDDFunctions | Dec 14, 2014 |
>>> | New Feature | SPARK-3405 <https://issues.apache.org/jira/browse/SPARK-3405> | Minor | EC2 cluster creation on VPC | Dec 13, 2014 |
>>> | Improvement | SPARK-1555 <https://issues.apache.org/jira/browse/SPARK-1555> | Minor | enable ec2/spark_ec2.py to stop/delete cluster non-interactively | Dec 13, 2014 |
>>>
>>> Stale Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>
>>>
>>> | Type | Key | Priority | Summary | Last Updated |
>>> |---|---|---|---|---|
>>> | Bug | SPARK-560 <https://issues.apache.org/jira/browse/SPARK-560> | None | Specialize RDDs / iterators | Oct 22, 2012 |
>>> | New Feature | SPARK-540 <https://issues.apache.org/jira/browse/SPARK-540> | None | Add API to customize in-memory representation of RDDs | Oct 22, 2012 |
>>> | Improvement | SPARK-573 <https://issues.apache.org/jira/browse/SPARK-573> | None | Clarify semantics of the parallelized closures | Oct 22, 2012 |
>>> | New Feature | SPARK-609 <https://issues.apache.org/jira/browse/SPARK-609> | Minor | Add instructions for enabling Akka debug logging | Nov 06, 2012 |
>>> | New Feature | SPARK-636 <https://issues.apache.org/jira/browse/SPARK-636> | Major | Add mechanism to run system management/configuration tasks on all workers | Dec 17, 2012 |
>>>
>>> Most Watched Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20watchers%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Watchers |
>>> |---|---|---|---|---|
>>> | New Feature | SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> | Major | Allow for pluggable execution contexts in Spark | 75 |
>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 33 |
>>> | Improvement | SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> | Major | Pluggable interface for shuffles | 30 |
>>> | New Feature | SPARK-1405 <https://issues.apache.org/jira/browse/SPARK-1405> | Critical | parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib | 26 |
>>> | New Feature | SPARK-1406 <https://issues.apache.org/jira/browse/SPARK-1406> | Major | PMML model evaluation support via MLib | 21 |
>>>
>>> Most Voted Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20votes%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Votes |
>>> |---|---|---|---|---|
>>> | Bug | SPARK-2541 <https://issues.apache.org/jira/browse/SPARK-2541> | Major | Standalone mode can’t access secure HDFS anymore | 12 |
>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 9 |
>>> | Improvement | SPARK-3533 <https://issues.apache.org/jira/browse/SPARK-3533> | Major | Add saveAsTextFileByKey() method to RDDs | 8 |
>>> | Bug | SPARK-2883 <https://issues.apache.org/jira/browse/SPARK-2883> | Blocker | Spark Support for ORCFile format | 6 |
>>> | New Feature | SPARK-1442 <https://issues.apache.org/jira/browse/SPARK-1442> | Major | Add Window function support | 6 |
