Following Andrew’s suggestion, perhaps the report can just focus on stale issues (no updates in > 90 days), since those are probably the easiest to act on.
For example:

Stale Issues <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>

 - [Oct 22, 2012] SPARK-560 <https://issues.apache.org/jira/browse/SPARK-560>: Specialize RDDs / iterators
 - [Oct 22, 2012] SPARK-540 <https://issues.apache.org/jira/browse/SPARK-540>: Add API to customize in-memory representation of RDDs
 - [Oct 22, 2012] SPARK-573 <https://issues.apache.org/jira/browse/SPARK-573>: Clarify semantics of the parallelized closures
 - [Nov 06, 2012] SPARK-609 <https://issues.apache.org/jira/browse/SPARK-609>: Add instructions for enabling Akka debug logging
 - [Dec 17, 2012] SPARK-636 <https://issues.apache.org/jira/browse/SPARK-636>: Add mechanism to run system management/configuration tasks on all workers

Andrew,

Does that seem more useful?

Nick

On Sun Dec 14 2014 at 3:20:54 AM Nicholas Chammas <[email protected]> wrote:

> I formatted this report using Markdown; I'm open to changing the structure
> or formatting or reducing the amount of information to make the report more
> easily consumable.
>
> Regarding just sending links or whether this would just be mailing list
> noise, those are good questions.
>
> I've sent out links before, but I feel from a UX perspective having the
> information right in the email itself makes it frictionless for people to
> act on the information. For me, that difference is enough to hook me into
> spending a few minutes on JIRA vs. just glossing over an email with a link.
>
> I wonder if that's also the case for others on this list.
>
> If you already spend a good amount of time cleaning up on JIRA, then this
> report won't be that relevant to you. But given the number and growth of
> open issues on our tracker, I suspect we could do with quite a few more
> people chipping in and cleaning up where they can.
>
> That's the real problem that this report is intended to help with.
>
> Nick
>
> On Sun Dec 14 2014 at 2:49:00 AM Andrew Ash <[email protected]> wrote:
>
>> The goal of increasing visibility on open issues is a good one. How is
>> this different from just a link to Jira though? Some might say this adds
>> noise to the mailing list and doesn't contain any information not already
>> available in Jira.
>>
>> The idea seems good but the formatting leaves a little to be desired. If
>> you aren't opposed to using HTML, I might suggest this more compact format:
>>
>> SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> Pluggable interface for shuffles
>> SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> Add IndexedRDD, an efficient updatable key-value
>> SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> Allow for pluggable execution contexts in Spark
>>
>> Andrew
>>
>> On Sat, Dec 13, 2014 at 11:31 PM, Nicholas Chammas <[email protected]> wrote:
>>
>>> What do y’all think of a report like this emailed out to the dev list on
>>> a monthly basis?
>>>
>>> The goal would be to increase visibility into our open issues and
>>> encourage developers to tend to our issue tracker more frequently.
>>>
>>> Nick
>>>
>>> There are 1,236 unresolved issues
>>> <https://issues.apache.org/jira/issues/?jql=project+%3D+SPARK+AND+resolution+%3D+Unresolved+ORDER+BY+updated+DESC>
>>> in the Spark project on JIRA.
>>>
>>> Recently Updated Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20updated%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Last Updated |
>>> | --- | --- | --- | --- | --- |
>>> | Bug | SPARK-4841 <https://issues.apache.org/jira/browse/SPARK-4841> | Major | Batch serializer bug in PySpark’s RDD.zip | Dec 14, 2014 |
>>> | Question | SPARK-4810 <https://issues.apache.org/jira/browse/SPARK-4810> | Major | Failed to run collect | Dec 14, 2014 |
>>> | Bug | SPARK-785 <https://issues.apache.org/jira/browse/SPARK-785> | Major | ClosureCleaner not invoked on most PairRDDFunctions | Dec 14, 2014 |
>>> | New Feature | SPARK-3405 <https://issues.apache.org/jira/browse/SPARK-3405> | Minor | EC2 cluster creation on VPC | Dec 13, 2014 |
>>> | Improvement | SPARK-1555 <https://issues.apache.org/jira/browse/SPARK-1555> | Minor | enable ec2/spark_ec2.py to stop/delete cluster non-interactively | Dec 13, 2014 |
>>>
>>> Stale Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20updated%20%3C%3D%20-90d%20ORDER%20BY%20updated%20ASC>
>>>
>>> | Type | Key | Priority | Summary | Last Updated |
>>> | --- | --- | --- | --- | --- |
>>> | Bug | SPARK-560 <https://issues.apache.org/jira/browse/SPARK-560> | None | Specialize RDDs / iterators | Oct 22, 2012 |
>>> | New Feature | SPARK-540 <https://issues.apache.org/jira/browse/SPARK-540> | None | Add API to customize in-memory representation of RDDs | Oct 22, 2012 |
>>> | Improvement | SPARK-573 <https://issues.apache.org/jira/browse/SPARK-573> | None | Clarify semantics of the parallelized closures | Oct 22, 2012 |
>>> | New Feature | SPARK-609 <https://issues.apache.org/jira/browse/SPARK-609> | Minor | Add instructions for enabling Akka debug logging | Nov 06, 2012 |
>>> | New Feature | SPARK-636 <https://issues.apache.org/jira/browse/SPARK-636> | Major | Add mechanism to run system management/configuration tasks on all workers | Dec 17, 2012 |
>>>
>>> Most Watched Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20watchers%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Watchers |
>>> | --- | --- | --- | --- | --- |
>>> | New Feature | SPARK-3561 <https://issues.apache.org/jira/browse/SPARK-3561> | Major | Allow for pluggable execution contexts in Spark | 75 |
>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 33 |
>>> | Improvement | SPARK-2044 <https://issues.apache.org/jira/browse/SPARK-2044> | Major | Pluggable interface for shuffles | 30 |
>>> | New Feature | SPARK-1405 <https://issues.apache.org/jira/browse/SPARK-1405> | Critical | parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib | 26 |
>>> | New Feature | SPARK-1406 <https://issues.apache.org/jira/browse/SPARK-1406> | Major | PMML model evaluation support via MLib | 21 |
>>>
>>> Most Voted Issues
>>> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20votes%20DESC>
>>>
>>> | Type | Key | Priority | Summary | Votes |
>>> | --- | --- | --- | --- | --- |
>>> | Bug | SPARK-2541 <https://issues.apache.org/jira/browse/SPARK-2541> | Major | Standalone mode can’t access secure HDFS anymore | 12 |
>>> | New Feature | SPARK-2365 <https://issues.apache.org/jira/browse/SPARK-2365> | Major | Add IndexedRDD, an efficient updatable key-value store | 9 |
>>> | Improvement | SPARK-3533 <https://issues.apache.org/jira/browse/SPARK-3533> | Major | Add saveAsTextFileByKey() method to RDDs | 8 |
>>> | Bug | SPARK-2883 <https://issues.apache.org/jira/browse/SPARK-2883> | Blocker | Spark Support for ORCFile format | 6 |
>>> | New Feature | SPARK-1442 <https://issues.apache.org/jira/browse/SPARK-1442> | Major | Add Window function support | 6 |
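
For anyone who wants to reproduce the stale-issue link and the compact list lines at the top of this message, here is a minimal sketch, assuming Python with only the standard library. The helper names (`stale_jql`, `search_url`, `format_issue`) are hypothetical and not part of any JIRA client, the issue data is typed in by hand rather than fetched, and this is not necessarily how the report itself was generated.

```python
from datetime import date
from urllib.parse import quote

# Base URL taken from the links in this thread.
JIRA_BASE = "https://issues.apache.org/jira"


def stale_jql(project="SPARK", days=90):
    """Build the JQL for unresolved issues with no updates in `days` days."""
    return (
        f"project = {project} AND resolution = Unresolved "
        f"AND updated <= -{days}d ORDER BY updated ASC"
    )


def search_url(jql):
    """Percent-encode the JQL and attach it to the issue search page."""
    return f"{JIRA_BASE}/issues/?jql={quote(jql, safe='')}"


def format_issue(key, summary, updated):
    """Render one list line: [date] KEY <browse link>: summary."""
    stamp = updated.strftime("%b %d, %Y")  # month name depends on the locale
    return f" - [{stamp}] {key} <{JIRA_BASE}/browse/{key}>: {summary}"


if __name__ == "__main__":
    print(f"Stale Issues <{search_url(stale_jql())}>")
    # Hand-entered sample row copied from the list in this email.
    print(format_issue("SPARK-560", "Specialize RDDs / iterators", date(2012, 10, 22)))
```

Changing `days` is the only thing needed to try a different staleness cutoff, which might help if 90 days turns out to be too aggressive or too lenient.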
