Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-08-13 Thread George Chatzisofroniou

This is probably my last report. Any comments about the project are
much appreciated. :)

GSoC is almost over and i just pushed all the work i’ve done these
three months. It was one of the greatest experiences and i’ve learned
a ton of stuff. I feel proud of my work and i can’t wait to see it
used in Mailman working environments.

For those who don’t know what it is all about, a live demo of the app
exists here [1]. The code is hosted on Launchpad [2], and if you are
interested in learning more, i would suggest you start by reading the
documentation, from installation [3] to the metrics collector [4] and
metrics display [5]. To learn even more about the functionality of the
app, please check the tests that lie in the ‘test app’ directory for
each submodule.

I would like to thank Wacky (a great mentor) and the rest of Mailman
developers for their help. Mailman has a great community and of course
i’ll stay here for anything regarding improvements of my app or
anything else Mailman-related.


George Chatzisofroniou
Mailman-Developers mailing list
Mailman FAQ:
Searchable Archives:

Security Policy:

Re: [Mailman-Developers] [GSoC 2012] HyperKitty

2012-08-13 Thread Aamir Khan

We have reached the end of GSoC. I have pushed all the code to the bzr repo.
There are two repositories: one is the Django application [1] and the other is
the Django standalone project [2]. This is almost the same way Postorius manages
its code in two separate repositories.

I had never worked with Django before GSoC. It was one of the greatest learning
experiences for me, and I learnt a lot about Django. When GSoC started, the code
of HyperKitty was very raw: many things (such as URLs) were hard coded and there
was no clear code structure. During my GSoC project I implemented
features in HyperKitty and also tried to fix these problems. We now have
a clear code structure: the app is separate from the project, URLs are not hard
coded, and unit tests are in place. However, HyperKitty is a big project
and we are not close to a release candidate yet. Essential features are still
missing; for example, we have to connect it to MM3, plug KittyStore into MM3,
and build a cleaner UI.

As always you can check the demo here [3]. (We are aware of the issue that
it throws 500 errors a lot of times :( )

I would like to thank pingou, toshio and the rest of the Mailman community
for their help. As my school year has started I won't be able to work full
time on this project, but I will keep pushing changes to HyperKitty when I
have free time.


On Sun, Jul 8, 2012 at 5:22 AM, Aamir Khan wrote:


 It has been some time since the last update about HyperKitty. I have
 deployed[1] the latest code at for everyone to
 take a look. Please file any bugs/missing features here[2]

 HyperKitty offers features like upvoting/downvoting of messages, adding
 tags (to be implemented soon!) to threads, user profiles and other user
 specific features. We require users to be logged in before using any of
 these features. So, the first task for my GSoC project was to implement
 login procedure for hyperkitty. It uses django-social-auth application [3]
 to implement login functionality. Currently, there are four ways of logging
 into hyperkitty test server namely, Google openid, yahoo openid, browserid,
 and registering the user on the site itself. I have not enabled other ways
 of logging in (e.g, twitter) but it should be fairly easy to enable those.

 I have written a blog post about the essential features an archiver should
 possess[4] in my opinion. The second most important thing for me is the
 ability to promote good content posted on the list. An archiver should be
 intelligent enough to suppress the spammy content and bring good content
 to the surface. For this we have implemented a feature in which a user can
 upvote/downvote a message. The users accessing a particular thread will be
 able to sort the messages based on votes of other users.

 I have also implemented the basic user profiles in which you can see your
 personal info and the messages you rated. You will be able to edit the user
 profiles soon.

 I have also started to write unit tests [5].

 Best regards,


Aamir Khan | 3rd Year | Computer Science & Engineering | IIT Roorkee

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-07-30 Thread George Chatzisofroniou

Here follows a new report. I've finished a coalescor daemon and i wrote
some generators for KittyStore too. I'm now working on establishing a real
MM3 connection.

The first version of coalescor is ready!

Coalescor is a custom Django admin command that is responsible for
maintaining the number of entries in the database at a reasonable
level. If the number of entries in the database is not a concern, its
use is completely optional.

There are five parameters which determine the behavior of the
represents the number of grains for which that level of detail is

The daemon coalesces only the new data from its last operation date
and it was designed in a manner not to be memory intensive.
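
The idea of coalescing can be sketched roughly as below. This is not the
project's code: the function name coalesce, the (day, count) representation
and the single keep_days parameter are assumptions made for illustration,
and the real command's parameters are not reproduced here.

```python
from datetime import date, timedelta


def coalesce(entries, today, keep_days):
    """Merge old daily event counts into one entry per month.

    entries is an iterable of (day, count) pairs for an event counter
    such as posts sent. Entries from the last keep_days days are kept
    as they are; older ones are summed into the month they fall in.
    Returns a list of (granularity, start, count) tuples ordered by start.
    """
    cutoff = today - timedelta(days=keep_days)
    merged = {}
    for day, count in entries:
        if day >= cutoff:
            key = ("day", day)
        else:
            # An old entry is folded into the bucket for its month.
            key = ("month", day.replace(day=1))
        merged[key] = merged.get(key, 0) + count
    return sorted(
        ((kind, start, count) for (kind, start), count in merged.items()),
        key=lambda row: (row[1], row[0]),
    )
```

Summing is only appropriate for event counters; a measured level (such as
the total number of subscribers) would need a different rule, for example
keeping the last value of the period.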

The last two days i also worked on generators. I had already created a
script that generates metrics from a mailbox. There is a KittyStore
API available, so it wasn’t hard to also generate metrics from SQL or

George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-07-15 Thread George Chatzisofroniou

Here's my new report. I've finally published my code and created some
samples for all of you. Youhou! :) Any feedback is very welcome.

I’m happy to report that the first version of the software is now
public. I have set up the app with some graph samples here [1], and
hosted the code in Launchpad [2].

Above each graph, there is the code snippet that is being used to
query the database. There are many syntax rules (most of them are
already working) that allow a rich vocabulary for the expression of
queries. You may see the BNF language description in the section ‘How
to use them’ in the documentation [3].

The JS library that renders the graphs is jqplot. It’s not difficult
(even for someone with little Django experience) to use another
plotting library. The data structure returned by the extraction tag is
very flexible and can be used in any manner. See the section
‘Configuring the output’ in the documentation [3].

My next step will be to implement the coalescor which (optionally)
merges database entries in order to limit the growth of the number of
entries stored in the database.

Thanks for reading and testing. I would highly appreciate any comments.


George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-07-07 Thread George Chatzisofroniou
Hi everyone,

Here's my new report [1] just before the midterm evaluation.

I am before the midterm evaluation and i’m having fun coding and

The project has reached the first version and achieves most of the
targets set. Specifically:

Metrics store is ready and it is designed in a way to make it easy to
add new metrics.  Every metric is either an event counter (eg posts
sent) or a measured level (eg number of total subscribers) and it is
stored in the database representing an interval of a granularity.
Special methods that retrieve counts for the specified interval or tally
the events with their effects (like a message sent or a new
subscription) are also completed.

Most of the graph tags are completed. In order to make it easy for the
designer to output the metrics he wishes, i have created a language and
its parser.  For example, if the designer wishes to output the data of
monthly number of posts for the last two years, he can easily extract
these counts as follows: “ {% EXTRACT AS graph1 %} posts MONTHLY FOR 2
YEARS {% ENDEXTRACT %} “. The whole syntax of tag’s language is defined
by a BNF Language description (you may find an earlier version of this
language in a previous post [2]).
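
As an illustration only, the body of such a query could be parsed along
these lines. The grammar used here (a metric name, a granularity keyword,
then FOR <number> <unit>) is an assumption inferred from the single example
above; the real BNF is richer, and the names Query and parse_query are
hypothetical.

```python
import re
from dataclasses import dataclass

# Keyword sets guessed from the example "posts MONTHLY FOR 2 YEARS";
# the actual language may accept other words.
GRANULARITIES = {"DAILY", "WEEKLY", "MONTHLY", "YEARLY"}
UNITS = {"DAYS", "WEEKS", "MONTHS", "YEARS"}

_PATTERN = re.compile(r"^\s*(\w+)\s+(\w+)\s+FOR\s+(\d+)\s+(\w+)\s*$")


@dataclass
class Query:
    metric: str
    granularity: str
    span: int
    unit: str


def parse_query(text):
    """Parse '<metric> <GRANULARITY> FOR <n> <UNIT>' into a Query."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognised query: %r" % text)
    metric, granularity, span, unit = match.groups()
    if granularity not in GRANULARITIES:
        raise ValueError("unknown granularity: %r" % granularity)
    if unit not in UNITS:
        raise ValueError("unknown unit: %r" % unit)
    return Query(metric, granularity, int(span), unit)
```

For instance, parse_query("posts MONTHLY FOR 2 YEARS") yields a Query with
metric "posts", granularity "MONTHLY", span 2 and unit "YEARS".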

Usually, the designer will add a template specification for the graph
tag and the returned values will be placed in this context, but it is
also possible to customize the way this context is presented or to create
his own template (for example, he can easily change the date format or
publish the metrics through a table), since the ’EXTRACT’ tag actually
returns a complex data structure that can be used in any manner.

However, some templates are already created for the designer, the
most significant being the one that outputs the graphs. I used the jqplot
library [3] (minimal and fast rendered graphs) to render the graphs. So, in
the previous example, to build the graph from the extracted data, the
designer simply has to add the following to his template: {% include
"MM3/line_graph.js" with dataset=graph1 title='List Activity' %}, after
loading the appropriate plotting libraries (there are ready templates
that can be included to do that).

For both metrics store and graph tags, i have already created a number
of tests. But i still need to add even more.

There are some things that have not been done yet and i’ll work on
them in the following days. I need to complete the implementation of the
language (there are some rules that my parser does not handle yet).  I
also want to create a template for bar graphs, design a
coalescor that will maintain the number of entries in the database at a
reasonable level and finish my generator that will generate the metrics
from scratch (a very first version is done). Last, i need to use
IArchiver to make a real connection with MM3 core (i currently use a
simulator for testing purposes).


George Chatzisofroniou

[Mailman-Developers] [GSoC 2012] HyperKitty

2012-07-07 Thread Aamir Khan

It has been some time since the last update about HyperKitty. I have
deployed[1] the latest code at for everyone to
take a look. Please file any bugs/missing features here[2]

HyperKitty offers features like upvoting/downvoting of messages, adding tags
(to be implemented soon!) to threads, user profiles and other user specific
features. We require users to be logged in before using any of these
features. So, the first task for my GSoC project was to implement login
procedure for hyperkitty. It uses django-social-auth application [3] to
implement login functionality. Currently, there are four ways of logging
into hyperkitty test server namely, Google openid, yahoo openid, browserid,
and registering the user on the site itself. I have not enabled other ways
of logging in (e.g, twitter) but it should be fairly easy to enable those.

I have written a blog post about the essential features an archiver should
possess[4] in my opinion. The second most important thing for me is the
ability to promote good content posted on the list. An archiver should be
intelligent enough to suppress the spammy content and bring good content
to the surface. For this we have implemented a feature in which a user can
upvote/downvote a message. The users accessing a particular thread will be
able to sort the messages based on votes of other users.

I have also implemented the basic user profiles in which you can see your
personal info and the messages you rated. You will be able to edit the user
profiles soon.

I have also started to write unit tests [5].

Best regards,


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-06-26 Thread Patrick Ben Koetter
* Patrick Ben Koetter
 let me throw in some thoughts just to annoy you ;)
 Like with most statistical data I mostly see the figures being used to give
 statements on quantity - top poster, number of threads etc. Do you think it
 would be possible to also make some statements on quality?
 Let me give an example: Mailing lists are often places where people go to ask
 for advice. Someone asking usually starts a thread and continually keeps
 replying. That easily makes a person top poster and might make the same person
 a thread starter, but number of posts and threads started gives no indication
 of that person's knowledge (concerning the mailing list's topic).
 OTOH someone who has been on the list for ages, who replies more often than
 starting threads and who ends threads often after she has replied might very
 well be a very knowledgeable person, because she gives the one answer that
 solves the problem.
 Do you think it would be possible to deduce such quality oriented statements?

As a follow-up: I just stumbled across, which is nice because it also
gives an overview over all (here: some) mailing lists an identity posts to.

The second pie chart seems to try to say something about quality. It splits
posts in 'relevant' and 'passive', which are not exactly opposites, but well …

Actually I'd say they still need to work on their rating: ;)


state of mind ()
Digitale Kommunikation

Franziskanerstraße 15  Telefon +49 89 3090 4664
81669 München  Telefax +49 89 3090 4666

Amtsgericht München  Partnerschaftsregister PR 563


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-20 Thread George Chatzisofroniou
Hello Stephen,

On Sat, May 19, 2012 at 09:12:03PM +0900, Stephen J. Turnbull wrote:
 I don't see why.  I would think quality metrics would be usefully
 presented via the same application as quantity metrics.  It would be
 interesting to correlate quality and quantity, for example.

What i was trying to say is that the quality metrics require a post
rating system to be designed first.

Of course they will be presented by my app.
 Do you mean to say this is out of scope of my project?  As much as
 I'd like to see quality metrics provided, I'd have to agree with you
 that it's out of scope of your project (maybe you could do one or more
 quality metrics on a time-permitting basis at the end of the period).

Yes, i think it is outside the scope of my GSoC project, but i would like to
implement this after i finish with the quantity metrics this summer.

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-19 Thread Stephen J. Turnbull
George Chatzisofroniou writes:

  This model represents an author of the mailing list. It mostly keeps
  track of the number of postings and number of threads started.  It has
  the following fields:
  - authorid – IntegerField

AFAIK every Django object has an internal ID.  Why do authors need a
separate, human-unfriendly authorid?

  - authormail – CharField

Authors are people.  They typically have names <wink/> and often
multiple email addresses.  There may also be other information
(organization, etc) that is available from the headers.

  - totalmails – IntegerField
  - totalthreads – IntegerField
  - firstmsgdate – DateTimeField
  - lastmsgdate – DateTimeField

  This model  counts  the total number of postings and threads started.
  - totalmails – IntegerField
  - totalthreads – IntegerField


  To display the metrics the Django template system will be used. To
  output the charts i will create some custom tags. The three following
  views will be used:
  - General page – On top, there will be general metrics about total
  authors, total mails and total threads and below three charts (AJAX

AJAX based doesn't belong in the spec; it's an implementation detail.

  that represent number of posts per author, number of threads
  per author and mailing list’s yearly usage. Even below there will be a
  number of charts (equal to the number of years of list’s existence)
  that output monthly usage.

Why multiple charts?  If you can afford a 640x480 chart area, with 4
pixel wide bars you can have 160 months > 13 years in one chart.  I
personally wouldn't hesitate to go to pixel width bars, which gives
you > 53 years.  I don't think people will be looking at charts for
precision, but rather to get an overview.

  At the end, there will be tabular data representing the authors
  - Author page – Each user will have his own page with his own metrics.
  Django Admin page – A ‘Generate’ button will be added to the Django admin 
  The Django app should handle the following configuration parameters:
  - Host – Message store data host
  - Port – Message store data port
  - Masking – A multi-state variable (None, abbreviated, full) for
  masking email addresses at the results (we don’t want the emails to be

(1) If at all possible, this should be inherited from the list
configuration (DRY).  It's not useful if the addresses are
available from the archives or by subscribing to the list.
(Actually, a really sophisticated spammer might want to attack by
spoofing frequent posters on the assumption they're more trusted
and more read, but that seems second-order to me.)
(2) It would be preferable if authors could supply nicknames, full
names, or avatars for this purpose.

  Interface to the Mailman core
  - Metrics class 
  - Generate class


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-19 Thread Stephen J. Turnbull
George Chatzisofroniou writes:
  Hello Patrick,
  On Fri, May 18, 2012 at 12:09 AM, Patrick Ben Koetter wrote:
   Like with most statistical data I mostly see the figures being
   used to give statements on quantity - top poster, number of
   threads etc. Do you think it would be possible to also make some
   statements on quality?


For example, I think this would be really useful for class discussion
lists and the like (on the theory that the best way to learn a subject
is to teach it to others).

  The metrics will primarily extract the activity of a mailing list
  and its users.
  It is possible to emphasize on the quality of the posts but i think
  this is a different app

I don't see why.  I would think quality metrics would be usefully
presented via the same application as quantity metrics.  It would be
interesting to correlate quality and quantity, for example.

Do you mean to say this is out of scope of my project?  As much as
I'd like to see quality metrics provided, I'd have to agree with you
that it's out of scope of your project (maybe you could do one or more
quality metrics on a time-permitting basis at the end of the period).


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-19 Thread Richard Wackerbarth

On May 19, 2012, at 7:12 AM, Stephen J. Turnbull wrote:

 George Chatzisofroniou writes:
 Hello Patrick,
 On Fri, May 18, 2012 at 12:09 AM, Patrick Ben Koetter wrote:
 Like with most statistical data I mostly see the figures being
 used to give statements on quantity - top poster, number of
 threads etc. Do you think it would be possible to also make some
 statements on quality?
 For example, I think this would be really useful for class discussion
 lists and the like (on the theory that the best way to learn a subject
 is to teach it to others).
 The metrics will primarily extract the activity of a mailing list
 and its users.
 It is possible to emphasize on the quality of the posts but i think
 this is a different app
 I don't see why.  I would think quality metrics would be usefully
 presented via the same application as quantity metrics.  It would be
 interesting to correlate quality and quantity, for example.
 Do you mean to say this is out of scope of my project?  As much as
 I'd like to see quality metrics provided, I'd have to agree with you
 that it's out of scope of your project (maybe you could do one or more
 quality metrics on a time-permitting basis at the end of the period).

I agree that he should think in terms of having additional quality-based 
metrics and presentations.
This ties directly into some of the things that the hyperkitty guys are doing.
We need to develop a single approach to the collection, storage, and display of 
this information.
However, other than considering how the hooks would work, I think that this 
is beyond the scope of this phase of development. Implementation should be 
considered to be part of a different contract.



Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-19 Thread Richard Wackerbarth
On May 19, 2012, at 6:58 AM, Stephen J. Turnbull wrote:

 George Chatzisofroniou writes:
 This model represents an author of the mailing list. It mostly keeps
 track of the number of postings and number of threads started.  It has
 the following fields:
 - authorid – IntegerField
 AFAIK every Django object has an internal ID.  Why do authors need a
 separate, human-unfriendly authorid?

Since George will not own the information about the author, this is his 
foreign key link into that data.

 - authormail – CharField
 Authors are people.  They typically have names <wink/> and often
 multiple email addresses.  There may also be other information
 (organization, etc) that is available from the headers.

I think that we should remove ALL of the author information from the MM core 
and create a separate service to collect and manage it. The mail handling core 
can subscribe to this service for the little necessary information that it 
requires about the persons.

In the real world, the relationship between the organization and the people 
subscribed to a mailing list often is not centered on the mailing list. They 
are customers, employees, participants, or such. From the POV of the mailing 
list, other than authentication of posting/subscription status, those details 
are not important. There is no reason why the mailing list handler should be 
the authority/repository for some, but not all, of the information about these 

 - totalmails – IntegerField
 - totalthreads – IntegerField
 - firstmsgdate – DateTimeField
 - lastmsgdate – DateTimeField

 This model  counts  the total number of postings and threads started.
 - totalmails – IntegerField
 - totalthreads – IntegerField

 To display the metrics the Django template system will be used. To
 output the charts i will create some custom tags. The three following
 views will be used:
 - General page – On top, there will be general metrics about total
 authors, total mails and total threads and below three charts (AJAX
 AJAX based doesn't belong in the spec; it's an implementation detail.

Agreed. George is still working on understanding the abstraction distinction.
He still wants to expose things that should remain under the hood. He should 
use more black paint.

 that represent number of posts per author, number of threads
 per author and mailing list’s yearly usage. Even below there will be a
 number of charts (equal to the number of years of list’s existence)
 that output monthly usage.
 Why multiple charts?  If you can afford a 640x480 chart area, with 4
 pixel wide bars you can have 160 months > 13 years in one chart.  I
 personally wouldn't hesitate to go to pixel width bars, which gives
 you > 53 years.  I don't think people will be looking at charts for
 precision, but rather to get an overview.

I would hope that he creates a chart widget (Django custom template tag) that 
will allow the site designer to choose the level of detail and duration covered 
by a particular instance.

 At the end, there will be tabular data representing the authors
 - Author page – Each user will have his own page with his own metrics.
 Django Admin page – A ‘Generate’ button will be added to the Django admin 
 The Django app should handle the following configuration parameters:
 - Host – Message store data host
 - Port – Message store data port
 - Masking – A multi-state variable (None, abbreviated, full) for
 masking email addresses at the results (we don’t want the emails to be
 (1) If at all possible, this should be inherited from the list
configuration (DRY).  It's not useful if the addresses are
available from the archives or by subscribing to the list.
(Actually, a really sophisticated spammer might want to attack by
spoofing frequent posters on the assumption they're more trusted
and more read, but that seems second-order to me.)
 (2) It would be preferable if authors could supply nicknames, full
names, or avatars for this purpose.

Agreed. This needs to be a part of the person data store.

 Interface to the Mailman core
 - Metrics class 
 - Generate class


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-19 Thread Barry Warsaw
On May 19, 2012, at 08:18 AM, Richard Wackerbarth wrote:

I think that we should remove ALL of the author information from the MM core
and create a separate service to collect and manage it. The mail handling
core can subscribe to this service for the little necessary information that
it requires about the persons.

This should be possible with today's Mailman 3, though it might not be obvious
(and certainly isn't tested ;).  To do this, you'd implement the IUserManager
interface with whatever external-service-consulting implementation you'd like
to come up with.  Then you'd associate that component implementation with the
utility via the zope.configuration file.

Once you've done this, Mailman will always get its users and addresses from
your separate service.  Come to think of it though, you probably also need
to re-implement the various IRoster implementations as well.

It might take some fiddling and experimentation, but the architecture intends
to make this kind of thing possible.


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-18 Thread George Chatzisofroniou
Hello Patrick,

On Fri, May 18, 2012 at 12:09 AM, Patrick Ben Koetter wrote:
 Like with most statistical data I mostly see the figures being used to give
 statements on quantity - top poster, number of threads etc. Do you think it
 would be possible to also make some statements on quality?

The metrics will primarily extract the activity of a mailing list and its users.

It is possible to emphasize on the quality of the posts but i think
this is a different app (but it interacts with the Metrics one). If
the users were able to rank the posts through the archiver it wouldn't
be hard to indicate one user's offering to his community. This looks
like the first idea on this page [1].

 Do you also plan to deliver a tool that analyzes a mailing list archive in
 order to gather your statistical data? Having the statistical data might be a
 good reason for people to upgrade their MMx installation to MM3.

Yes, there will be a special Generate button in case of an existing
archive or in case of a system crash.


George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-17 Thread George Chatzisofroniou
The following document is the lowest level of my design concept. You
may also read it in my blog [1]. Of course, comments are very welcome.


In order to store statistical data, the app will use some Django models:


This model represents an author of the mailing list. It mostly keeps
track of the number of postings and number of threads started.  It has
the following fields:

- authorid – IntegerField
- authormail – CharField
- totalmails – IntegerField
- totalthreads – IntegerField
- firstmsgdate – DateTimeField
- lastmsgdate – DateTimeField


This model  counts  the total number of postings and threads started.

- totalmails – IntegerField
- totalthreads – IntegerField


This model associates the author and the mailing list with each month.

- author – ForeignKey
- month – CharField
- postscount – IntegerField
- threadscount – IntegerField
- mailinglist – Boolean (if this is true it corresponds to the whole
mailing list)


This model is similar to month. It has a year field instead of a month field.


To display the metrics the Django template system will be used. To
output the charts i will create some custom tags. The three following
views will be used:

- General page – On top, there will be general metrics about total
authors, total mails and total threads and below three charts (AJAX
based) that represent number of posts per author, number of threads
per author and mailing list’s yearly usage. Even below there will be a
number of charts (equal to the number of years of list’s existence)
that output monthly usage. At the end, there will be tabular data
representing the authors of the mailing list along with their number
of posts, number of threads started and the date of their last post.
The user will be able to order the tabular data (alphabetically,
ascending/descending on number of posts, number of threads, date of
last message) by clicking on the table’s headings (Mail, Mails Sent,
Threads Started, Last Message).

- Author page – Each user will have his own page with his own metrics.
On top, there will be the email of the author, number of posts, number
of threads started and the dates of first and last message. Below
there will be monthly usage charts for each year the user is
subscribed to the mailing list.
- Django Admin page – A ‘Generate’ button will be added to the Django admin page.

The Django app should handle the following configuration parameters:

- Host – Message store data host
- Port – Message store data port
- Masking – A multi-state variable (None, abbreviated, full) for
masking email addresses at the results (we don’t want the emails to be

Interface to the Mailman core

- Metrics class – When a new post is sent, the Metrics class will
receive it through the IArchiver interface. The Posts field of the
Mailing List model (as well as the related rows on the Month and
Year models) will increase by one. If the author’s email is not in the
database, it will query the mailman core database with the email, grab
the author’s id and a new Author row will be created. Otherwise if the
author is already in the database, the Posts field and the two foreign
fields (Month and Year) will increase by one.

- Generate class – When the ‘Generate’ button on the Admin page is pressed:
* The Django models will be initialized (the metrics will go back to
zero). A progress bar will inform the administrator that the operation
is being processed.
* All the messages of the archive will be parsed by performing a
direct Python call to the IArchiver. Another instance to the IArchiver
will grab any mails sent while the parsing is going on.
* The metrics will be generated from scratch.
* The administrator will be informed with a success message when the
process is over.
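
The bookkeeping described for the Metrics and Generate classes can be
sketched in plain Python as follows. This is only a sketch under stated
assumptions: dictionaries stand in for the Django models, the class and
method names (MetricsSketch, on_post, regenerate) are hypothetical, and
thread counting, the lookup of the author's id in the Mailman core and the
IArchiver plumbing are left out.

```python
from collections import defaultdict
from datetime import datetime


class MetricsSketch:
    """In-memory stand-in for the Author, Mailing List, Month and Year models."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Put every metric back to zero, as 'Generate' does first.
        self.list_posts = 0
        self.authors = {}
        # Keys are (email, period); email None means the whole list.
        self.monthly = defaultdict(int)
        self.yearly = defaultdict(int)

    def on_post(self, author_email, sent):
        """Update the counters for one newly received message."""
        month = sent.strftime("%Y-%m")
        self.list_posts += 1
        self.monthly[(None, month)] += 1
        self.yearly[(None, sent.year)] += 1

        author = self.authors.get(author_email)
        if author is None:
            # First message from this address: create the author row.
            author = {"totalmails": 0, "firstmsgdate": sent, "lastmsgdate": sent}
            self.authors[author_email] = author
        author["totalmails"] += 1
        author["firstmsgdate"] = min(author["firstmsgdate"], sent)
        author["lastmsgdate"] = max(author["lastmsgdate"], sent)
        self.monthly[(author_email, month)] += 1
        self.yearly[(author_email, sent.year)] += 1

    def regenerate(self, archive):
        """Rebuild all metrics from an iterable of (email, sent) pairs."""
        self.reset()
        for author_email, sent in archive:
            self.on_post(author_email, sent)
```

Keeping the whole-list totals under the key None mirrors the boolean
mailinglist field in the Month model above.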


George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-06 Thread George Chatzisofroniou
On Sat, May 5, 2012 at 11:26 PM, Barry Warsaw wrote:
 Do you know exactly what kind of information you need?

With my current implementation plan i only need author and date of the
message to export my metrics.

 The suggestion to use the IArchiver interface is convenient, but might not
 tell you much other than who posted what to which list on what date.  It won't
 tell you things like the number of recipients, how long that message took to
 deliver, if there were any failures, etc.

Some of the things you mentioned sound pretty interesting and it would
be great to have them in the metrics.

 It would certainly be possible to hook in a zope.event notification with those
 metrics for each successfully posted message.  From there, a plugin could
 register a subscriber that put the event data on a message bus.  Or you could
 grep the logs. :)

Hooking the zope.event notification sounds good.

 Anyway, I think it would be useful to improve the support for this in the

I'd like that!

I'll come up with another more-detailed report this week.

Thanks Barry!

George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-05 Thread Stephen J. Turnbull
Terri Oda writes:

  So... If AJAX is the fastest way to get some initial prototypes going, 
  that's a good place to start.

As the author of the original suggestion about AJAXing the charts, I
don't actually think it is the easiest way.  I think it's easier to
just generate a fixed-size chart and slam it in as an IMG, and
generally good enough for the purpose.

The point of the suggestion was the coolness (IMHO) and robustness (to
window resizes and the like) of charts that expand to fit the width of
the window by adding data for more items (users, Months, whatever),
*without* asking the user to fill in a form to say how wide, just
resize the browser window.

It should be possible to do this with CSS or similar, but then you'd
need to generate a chart as wide as you think anyone could want and
serve that to everybody.

If you and George disagree with me on the coolness factor, great!
I'll be looking forward to some beautiful things I didn't imagine!

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-05 Thread Florian Fuchs

On 05.05.12 17:21, Stephen J. Turnbull wrote:
 Terri Oda writes:
   So... If AJAX is the fastest way to get some initial prototypes going, 
   that's a good place to start.
 As the author of the original suggestion about AJAXing the charts, I
 don't actually think it is the easiest way.  I think it's easier to
 just generate a fixed-size chart and slam it in as an IMG, and
 generally good enough for the purpose.

Creating JS-/AJAX-ified graphs doesn't have to be hard. There are some
*really* awesome libraries out there (such as flot[1], Raphael[2] or
d3[3]) that make stuff like this pretty easy. (Those three are all MIT
licensed so I guess there should be no collision with the GPL?).

But I would consider carefully where to use or not to use actual AJAX
(meaning: asynchronous server requests). If there is a lot of data or if
the data depends on on-the-fly user inputs (like drill-down charts)
asynchronous requests are very useful. But charts can be fed from many
sources and AJAX is not always the best choice. For example, the
JavaScript application could read the data from a JSON string that is
delivered within the page content. Or it could use HTML5 data-
attributes. Or it could DOM-traverse an HTML table and build the chart
from its contents (that one could also make a nice bare-bones fallback
for non-JS browsers).
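
To make the non-AJAX options above concrete, here is a small sketch of the server side: it renders the monthly counts as a plain HTML table (usable as the non-JS fallback, or as something a script could DOM-traverse) and also embeds the same data as a JSON block in the page. The function name, element ids and sample data are hypothetical and only serve the illustration.

```python
import json
from html import escape


def render_chart_data(posts_by_month):
    """Return an HTML fragment: a plain table plus the same data as inline JSON."""
    rows = "\n".join(
        f"    <tr><td>{escape(month)}</td><td>{count}</td></tr>"
        for month, count in posts_by_month
    )
    table = (
        '<table id="posts-by-month">\n'
        "    <tr><th>Month</th><th>Posts</th></tr>\n"
        f"{rows}\n"
        "</table>"
    )
    # A charting script can read this block instead of issuing an AJAX request.
    payload = json.dumps([{"month": m, "posts": c} for m, c in posts_by_month])
    # Escape "</" so the data can never close the script element early.
    payload = payload.replace("</", "<\\/")
    script = f'<script type="application/json" id="posts-data">{payload}</script>'
    return table + "\n" + script


print(render_chart_data([("2012-04", 41), ("2012-05", 57)]))
```

Browsers without JavaScript simply show the table; a chart library, if present, can parse the JSON block or the table and draw over it.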

As for Geoff's non-JS request: I agree that it's perfectly reasonable to
ask for a non-JS version. But first let's let George be creative and
have some fun. If he comes up with a visualization that gives us
insights we would not get from a simple img or html table I surely
wouldn't want to miss that for the sake of general JS-freeness and 100%
accessibility. There's a reason modern data visualizations don't look
like good ol' analog[4] stats anymore... :-) That said, there are a lot
of ways to create beautiful fallback versions for non-JS (and even
text-based) browsers, and we should take care of that at *some* point
for sure.


[2], also gRaphael for analytics:


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-05 Thread Barry Warsaw
On May 03, 2012, at 02:04 AM, George Chatzisofroniou wrote:

Interface to the MM core

The app needs to interact with the Mailman core. I think the best idea
is to implement a message bus that will send a notification every time
a message is sent (the same way an archiver works). Based on this
notification -which will carry the information about the sent message-
the app will be ready to update the counts.

However, in some cases (e.g. the app is installed after an archive
already exists, or there was an unexpected crash), the message bus should
deliver (triggered by a button) notifications for the whole archive
to the app in order to initialize/recover the metrics.
In those cases, the models will be re-initialized and the generation of
the metrics will start over.

Do you know exactly what kind of information you need?

The suggestion to use the IArchiver interface is convenient, but might not
tell you much other than who posted what to which list on what date.  It won't
tell you things like the number of recipients, how long that message took to
deliver, if there were any failures, etc.

The core does know this, and in fact logs it, but all you get at the IArchiver
interface is the posted message.

It would certainly be possible to hook in a zope.event notification with those
metrics for each successfully posted message.  From there, a plugin could
register a subscriber that put the event data on a message bus.  Or you could
grep the logs. :)
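
A sketch of the subscriber pattern suggested above, under stated assumptions. The real zope.event package exposes a module-level `subscribers` list and a `notify(event)` function; to keep the example dependency-free, the two names are re-created locally here with the same behaviour. The `MessagePosted` event class, its fields and the `queue.Queue` standing in for a message bus are all hypothetical, not part of Mailman.

```python
import queue
from dataclasses import dataclass

# Local stand-in for zope.event: a list of callables plus a notify() that
# calls each of them with the event object.
subscribers = []


def notify(event):
    for subscriber in subscribers:
        subscriber(event)


@dataclass
class MessagePosted:
    """Hypothetical event; the field names are illustrative assumptions."""
    list_name: str
    sender: str
    recipients: int
    delivery_seconds: float
    failures: int


bus = queue.Queue()  # placeholder for a real message bus


def forward_to_bus(event):
    # Every subscriber sees every event, so filter for the type we care about.
    if isinstance(event, MessagePosted):
        bus.put(event)


subscribers.append(forward_to_bus)

notify(MessagePosted("mailman-developers", "george@example.com", 120, 3.4, 0))
notify("some unrelated event")
print(bus.qsize())
```

The type check matters because subscribers registered this way receive everything that is notified, not only the metrics events.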

Anyway, I think it would be useful to improve the support for this in the



Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Stephen J. Turnbull
Richard Wackerbarth writes:

  I will take the blame for any misunderstanding in this area.

There's no blame to be assigned, really, unless to me for being a
busybody. :-)  If George is making these plans while consulting his
mentor, that's what this is all about (but he didn't say that, so I
stuck my nose in!)  And maybe I misinterpreted the word "coding"
which, as I continue to explore the GSoC documentation, seems to be a
catchall for GSoC work.

It's just been my experience in supervising economics and business
students' research (confirmed by a lot of the academic and practical
literature on managing software development) that people facing a
scheduling issue tend to skimp on planning, hoping to get lucky with a
first draft and as a backup, planning to revise during implementation.
This doesn't usually save time in the end, and often ends with a lower
quality product.

How George chooses to accomplish the planning, design, and coding, I'm
happy to leave up to him (and you).  I just wanted to warn against
hoping that producing a plan and design more quickly than he otherwise
would will save time.  Time taken liberally in the early stages will
conversely probably save time later.


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Richard Wackerbarth

I am in complete agreement with your points about adequate planning.

However, I rarely see large projects that do enough research to plan things 
properly from the start.
Usually, a few people, who think that they understand the whole problem, make 
early design decisions that often become obstacles in the future. It is only 
after the prototype has been developed that others are able to point out 
weaknesses in the initial design.  As such, I advocate for a planned "revise 
during implementation" to the extent that you schedule a reimplementation for a 
new generation of the product rather than continually attempting to add on to 
the previous design. 


On May 4, 2012, at 4:36 AM, Stephen J. Turnbull wrote:

 Richard Wackerbarth writes:
 I will take the blame for any misunderstanding in this area.
 There's no blame to be assigned, really, unless to me for being a
 busybody. :-)  If George is making these plans while consulting his
 mentor, that's what this is all about (but he didn't say that, so I
 stuck my nose in!)  And maybe I misinterpreted the word coding
 which, as I continue to explore the GSoC documentation, seems to be a
 catchall for GSoC work.
 It's just been my experience in supervising economics and business
 students' research (confirmed by a lot of the academic and practical
 literature on managing software development) that people facing a
 scheduling issue tend to skimp on planning, hoping to get lucky with a
 first draft and as a backup, planning to revise during implementation.
 This doesn't usually save time in the end, and often ends with a lower
 quality product.
 How George chooses to accomplish the planning, design, and coding, I'm
 happy to leave up to him (and you).  I just wanted to warn against
 hoping that producing a plan and design more quickly than he otherwise
 would will save time.  Time taken liberally in the early stages will
 conversely probably save time later.


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread George Chatzisofroniou
Hello Stephen,

 On Thu, May 3, 2012 at 8:04 AM, George Chatzisofroniou wrote:

 I’m also thinking to rearrange the GSoC schedule a bit. I’ll start
 writing code on the bonding period,

 It's up to your coding mentors, but it's generally not a good idea to
 try to move up the coding stage by too much.  Make sure you have a
 clear spec before you start coding anything, and at least a rough
 sketch of a design.  Without those two pieces, there's no standard to
 evaluate progress, or whether your code is doing the right thing.

As Wacky mentioned, I'm not planning to skip any step of the process.
The idea is to push myself a bit more now, so I can have some more
time to study later. I should have explained myself better.

 so i can have more time during my
 university’s exams (starting on half of June).

 I'd say just take the time as needed, after negotiating with your mentors.


 The Django app should handle some configuration parameters, like:

 - Maximum number of the subscribers of a mailing list the user wants
 to be shown in the charts

 I don't know how feasible it would be to implement, but if you're
 willing to use AJAX, you could simply build up bar and line charts on
 the fly, adding slices of say 5 users at a time on the right side of
 the chart in progress.  That might be cool.  It could be set in a
 viewport and scrolled if it gets too big for the viewport.

Yes, AJAX can do the job here. I will implement the AJAX bar as you
described, although i think it's better to primarily have a
non-Javascript output.

 - Month – This model will store total posts and threads for each month
 - Year – Similar to month model, this one will store total posts and
 threads for each year

 It's not clear to me why year views can't be generated as an aggregate
 of monthly data?  This would allow years to start with arbitrary
 months without too much redundancy.

Generating the year views from monthly data requires extra calculations
every time the metrics are displayed. That's why they should be stored.

 Interface to the MM core

 The app needs to interact with the Mailman core. I think the best idea
 is to implement a message bus that will send a notification every time
 a message is sent (the same way an archiver works).

 Why not just use the iArchiver interface?

 I think it should be possible to have this be a standalone app serving
 the list itself, or be hosted by a particular archive, which wouldn't
 need to be the list host (or even run by the same organization).

Yes, i agree. The iArchiver should be the origin of the interface.

Thank you for your feedback,

George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Stephen J. Turnbull
Richard Wackerbarth writes:

  Usually, a few people, who think that they understand the whole
  problem, make early design decisions that often become obstacles in
  the future. It is only after the prototype has been developed that
  others are able to point out weaknesses in the initial design.

I bow to your superior experience in the field (there are no large
projects needed in mine! ;-)

  As such, I advocate for a planned revise during implementation to
  the extent that you schedule a reimplementation for a new
  generation of the product rather than continually attempting to
  add on to the previous design.

I see.  So the idea is that that's the stage that George is at anyway?
He built one to throw away, and now it's time to progress to the
reimplementation?  (I forgot about that aspect; indeed, the fact that
he's already built a project similar to this one should speed up the
planning and design stage.)


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Stephen J. Turnbull
George Chatzisofroniou writes:

   It's not clear to me why year views can't be generated as an aggregate
   of monthly data?  This would allow years to start with arbitrary
   months without too much redundancy.
  Generating the year views from monthly data is some more calculations
  while displaying the metrics. That's why they should be stored

OK, but these are pretty cheap calculations if I understand Django's
design correctly.
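
For reference, the aggregation under discussion can be sketched in a few lines. This is plain Python rather than Django ORM code (with the ORM, the same sum would presumably be a single aggregate query); the function name, the `(year, month)` key layout and the sample numbers are assumptions made for the example. The `start_month` parameter shows the "years starting with arbitrary months" case.

```python
from collections import defaultdict


def yearly_totals(monthly_posts, start_month=1):
    """Sum {(year, month): posts} into reporting years that begin in start_month."""
    totals = defaultdict(int)
    for (year, month), posts in monthly_posts.items():
        # Months before the start month belong to the reporting year
        # that began in the previous calendar year.
        reporting_year = year if month >= start_month else year - 1
        totals[reporting_year] += posts
    return dict(totals)


monthly = {(2011, 11): 30, (2011, 12): 25, (2012, 1): 40, (2012, 4): 10}
print(yearly_totals(monthly))
print(yearly_totals(monthly, start_month=4))
```

The cost is one addition per stored month, which is the trade-off being weighed against keeping a separate Year model.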

Which design do you think is simpler, in the context of your overall

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread George Chatzisofroniou
On Fri, May 4, 2012 at 4:06 PM, Stephen J. Turnbull wrote:
 OK, but these are pretty cheap calculations if I understand Django's
 design correctly.

Yes, these are cheap calculations. However, the number of
calculations depends on how many years the list has existed (e.g. a
10-year-old list would need 10 calculations).

 Which design do you think is simpler, in the context of your overall

I think it's simpler to just add a year model.

I would like to hear more opinions on this issue.

George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Geoff Shang

On Fri, 4 May 2012, George Chatzisofroniou wrote:

On Thu, May 3, 2012 at 8:04 AM, George Chatzisofroniou wrote:

I don't know how feasible it would be to implement, but if you're
willing to use AJAX, you could simply build up bar and line charts on
the fly, adding slices of say 5 users at a time on the right side of
the chart in progress.  That might be cool.  It could be set in a
viewport and scrolled if it gets too big for the viewport.

Yes, AJAX can do the job here. I will implement the AJAX bar as you
described, although i think it's better to primarily have a
non-Javascript output.

I would just like to restate my plea for the ability to manage Mailman 
without needing javascript.  Note that I'm not saying don't use any, I'm 
merely asking for it to be possible to use without it.

I use lynx as my primary browser, and while general web surfing can be a 
bit tricky these days, it's very quick and useful for doing admin-type 
things, and I for one would like to be able to continue to do so. 
Certainly I have no problems using Mailman 2.1's interface with lynx, and 
have been doing so for many years.

If my memory serves me correctly, this was generally felt to be a 
reasonable request when I first mentioned it.  I'm sure it's in the 
archives somewhere.


Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread George Chatzisofroniou
Hello Geoff,

On Sat, May 5, 2012 at 12:17 AM, Geoff Shang wrote:

 I would just like to restate my plea for the ability to manage Mailman
 without needing javascript.  Note that I'm not saying don't use any, I'm
 merely asking for it to be possible to use without it.

 I use lynx as my primary browser, and while general web surfing can be a bit
 tricky these days, it's very quick and useful for doing admin-type things,
 and I for one would like to be able to continue to do so. Certainly I have
 no problems using Mailman 2.1's interface with lynx, and have been doing so
 for many years.

 If my memory serves me correctly, this was generally felt to be a reasonable
 request when I first mentioned it.  I'm sure it's in the archives somewhere.

Yes, i also agree with you. That's why i mentioned a primarily
non-Javascript output.

George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-04 Thread Terri Oda

On 12-05-04 3:17 PM, Geoff Shang wrote:
I would just like to restate my plea for the ability to manage Mailman 
without needing javascript.  Note that I'm not saying don't use any, 
I'm merely asking for it to be possible to use without it.

This is totally a reasonable request, but I'd like to point out that 
what George is doing is not really on the management path.  If people 
without JavaScript can't see his graph prototypes ... oh well.  Just put 
up a note saying JavaScript is required for metrics.  Not a big deal as 
long as it doesn't break other functionality.

So... If AJAX is the fastest way to get some initial prototypes going, 
that's a good place to start.  If you want to do something else, go for 
that!  But with the compressed GSoC schedule, I want to make sure that 
George spends most of his time working on the 90% of the project 
that's faster and fun and interesting, rather than spending inordinate 
amounts of time in that 10% that involves the harder finicky bits like 
testing in IE4.

It's always a good idea to discuss this stuff and to consider backwards 
compatibility, but as an experienced GSoC mentor I've seen a few GSoC 
projects fall off the rails due to starting with the 10% and never 
reaching the 90%.  There just isn't enough time in the 12 weeks.  So as 
one of George's mentors I'm going to encourage him to focus on making 
beautiful things.  No fiddly bits 'till after midterms at the earliest, 
and it is a-ok to just leave some of that stuff for the rest of the dev 
team to figure out later.



Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-03 Thread Stephen J. Turnbull
Hi George,

Thanks for the report!  A couple of quick comments:

On Thu, May 3, 2012 at 8:04 AM, George Chatzisofroniou wrote:

 I’m also thinking to rearrange the GSoC schedule a bit. I’ll start
 writing code on the bonding period,

It's up to your coding mentors, but it's generally not a good idea to
try to move up the coding stage by too much.  Make sure you have a
clear spec before you start coding anything, and at least a rough
sketch of a design.  Without those two pieces, there's no standard to
evaluate progress, or whether your code is doing the right thing.

 so i can have more time during my
 university’s exams (starting on half of June).

I'd say just take the time as needed, after negotiating with your mentors.


 The Django app should handle some configuration parameters, like:

 - Maximum number of the subscribers of a mailing list the user wants
 to be shown in the charts

I don't know how feasible it would be to implement, but if you're
willing to use AJAX, you could simply build up bar and line charts on
the fly, adding slices of say 5 users at a time on the right side of
the chart in progress.  That might be cool.  It could be set in a
viewport and scrolled if it gets too big for the viewport.
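
One way the "slices of 5 users at a time" idea could look on the server side is sketched below: a function that returns one JSON-encoded slice of the poster ranking, which a script could request repeatedly and append to the chart. The function name, the JSON field names and the sample counts are hypothetical; a real view would wrap this in whatever the web framework provides.

```python
import json
from collections import Counter


def top_posters_slice(posts_by_author, offset=0, size=5):
    """Return one JSON-encoded slice of the ranking, plus whether more slices remain."""
    # Sort by post count (descending), then by name so ties are stable.
    ranked = sorted(posts_by_author.items(), key=lambda item: (-item[1], item[0]))
    chunk = ranked[offset:offset + size]
    return json.dumps({
        "offset": offset,
        "authors": [{"name": name, "posts": posts} for name, posts in chunk],
        "has_more": offset + size < len(ranked),
    })


counts = Counter({"ann": 9, "bob": 7, "cy": 7, "dee": 4, "eve": 3, "fay": 2, "gus": 1})
first = json.loads(top_posters_slice(counts))
second = json.loads(top_posters_slice(counts, offset=5))
print([a["name"] for a in first["authors"]], first["has_more"])
print([a["name"] for a in second["authors"]], second["has_more"])
```

The `has_more` flag lets the client stop requesting slices without knowing the total number of subscribers in advance.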

 - A multi-state variable (None, abbreviated, full) for masking email
 addresses at the results (we don’t want the emails to be spammed)

This is pointless if the archives show addresses.  So archives and
list should coordinate with the chart module on this.  I suppose this
should be a query to the list config (and maybe per user configs).

 - Month – This model will store total posts and threads for each month
 - Year – Similar to month model, this one will store total posts and
 threads for each year

It's not clear to me why year views can't be generated as an aggregate
of monthly data?  This would allow years to start with arbitrary
months without too much redundancy.

 Interface to the MM core

 The app needs to interact with the Mailman core. I think the best idea
 is to implement a message bus that will send a notification every time
 a message is sent (the same way an archiver works).

Why not just use the iArchiver interface?

I think it should be possible to have this be a standalone app serving
the list itself, or be hosted by a particular archive, which wouldn't
need to be the list host (or even run by the same organization).

Re: [Mailman-Developers] [GSoC 2012] Metrics

2012-05-03 Thread Richard Wackerbarth

I will take the blame for any misunderstanding in this area.

The GSoC calendar is based on the assumption that the students will have 
completed their spring academic work prior to the serious coding phase.

Since conditions in Greece have delayed George's academic year, his spring 
final exams will now occur during the summer.
I suggested that, in order to meet his mid-term goals in a timely fashion, he 
might expend some extra hours now, hopefully be ready to start coding a little 
early, and then be able to take a planned period of reduced effort during his 

I am not, in any manner, suggesting that he skip any steps in the process.


 On Thu, May 3, 2012 at 8:04 AM, George Chatzisofroniou wrote:
 I’m also thinking to rearrange the GSoC schedule a bit. I’ll start
 writing code on the bonding period,
 It's up to your coding mentors, but it's generally not a good idea to
 try to move up the coding stage by too much.  Make sure you have a
 clear spec before you start coding anything, and at least a rough
 sketch of a design.  Without those two pieces, there's no standard to
 evaluate progress, or whether your code is doing the right thing.
 so i can have more time during my
 university’s exams (starting on half of June).
 I'd say just take the time as needed, after negotiating with your mentors.


[Mailman-Developers] [GSoC 2012] Metrics

2012-05-02 Thread George Chatzisofroniou

I'm sending a report of my progress for the last two weeks (containing
an outline of my project).

The following report is also posted in my blog [1].

For the last two weeks I have been learning the Django web framework in more detail.

I’m also thinking of rearranging the GSoC schedule a bit. I’ll start
writing code during the bonding period, so I can have more time during my
university’s exams (starting in mid-June).

Finally, i came up with an outline of my project. Here it is:


The Django app should handle some configuration parameters, like:

- Maximum number of the subscribers of a mailing list the user wants
to be shown in the charts
- A multi-state variable (None, abbreviated, full) for masking email
addresses at the results (we don’t want the emails to be spammed)
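
A minimal sketch of how these two parameters might be applied. The setting names, the exact meaning of the three masking states and the masked output format are assumptions made for the example; the outline above does not specify them.

```python
# Hypothetical settings; the names are illustrative only.
SETTINGS = {
    "MAX_SUBSCRIBERS_IN_CHARTS": 2,
    "EMAIL_MASKING": "abbreviated",  # one of "none", "abbreviated", "full"
}


def mask_email(address, mode="abbreviated"):
    """Mask an address according to the three-state setting."""
    if mode == "none":
        return address
    local, sep, domain = address.partition("@")
    if not sep:
        # Not a well-formed address; hide it entirely rather than guess.
        return "***"
    if mode == "full":
        return "***@***"
    if mode == "abbreviated":
        # Keep the first character and the domain, hide the rest.
        return f"{local[:1]}***@{domain}"
    raise ValueError(f"unknown masking mode: {mode!r}")


def chart_rows(posts_by_author, settings=SETTINGS):
    """Return the top posters, limited and masked as the settings dictate."""
    ranked = sorted(posts_by_author.items(), key=lambda item: (-item[1], item[0]))
    limit = settings["MAX_SUBSCRIBERS_IN_CHARTS"]
    mode = settings["EMAIL_MASKING"]
    return [(mask_email(address, mode), posts) for address, posts in ranked[:limit]]


print(chart_rows({"a@x.org": 3, "b@x.org": 5, "c@x.org": 1}))
```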

Apart from these, some parameters to establish a connection to the
message store (such as host or port) will be added.


In order to store statistical data, the app will use some Django models:

- Author – This model will store specific information for each
subscriber such as total posts, total threads started, date of last
message, subscription date, average of posts sent per day
- MailingList – General information about the mailing list will be
stored here such as total posts, total threads and two foreign keys
pointing to the month and year models
- Month – This model will store total posts and threads for each month
- Year – Similar to month model, this one will store total posts and
threads for each year
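
The four models listed above can be pictured roughly as follows. This sketch uses plain dataclasses rather than Django model classes so that it runs on its own; the field names are guesses based on the descriptions, and modelling the two foreign keys as references to the current month and year rows is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Month:
    year: int
    month: int
    posts: int = 0
    threads: int = 0


@dataclass
class Year:
    year: int
    posts: int = 0
    threads: int = 0


@dataclass
class MailingList:
    name: str
    posts: int = 0
    threads: int = 0
    # Stand-ins for the two foreign keys mentioned in the outline.
    current_month: Optional[Month] = None
    current_year: Optional[Year] = None


@dataclass
class Author:
    email: str
    subscribed_on: date
    posts: int = 0
    threads_started: int = 0
    last_message_on: Optional[date] = None

    def posts_per_day(self, today):
        # Average over the days since subscription; count at least one day
        # so a brand-new subscriber does not cause a division by zero.
        days = max((today - self.subscribed_on).days, 1)
        return self.posts / days


author = Author("george@example.com", date(2012, 5, 1), posts=10)
print(author.posts_per_day(date(2012, 5, 11)))
```

Here the average is derived on demand instead of being stored; storing it, as the outline suggests, would mean recomputing it on every update.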


To display the metrics some views will be used:

- A generic view for the mailing list and the top subscribers
- A more detailed page about the participation of a specific subscriber

The graphs will be generated using the PyChart library. (Custom tags
are the way that the app will embed the graphs)

Interface to the MM core

The app needs to interact with the Mailman core. I think the best idea
is to implement a message bus that will send a notification every time
a message is sent (the same way an archiver works). Based on this
notification -which will carry the information about the sent message-
the app will be ready to update the counts.

However, in some cases (e.g. the app is installed after an archive
already exists, or there was an unexpected crash), the message bus should
deliver (triggered by a button) notifications for the whole archive
to the app in order to initialize/recover the metrics.
In those cases, the models will be re-initialized and the generation of
the metrics will start over.

Any comments are highly appreciated.


George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Candidate on 'Integration of (existing) search code into Mailman archives'

2012-04-05 Thread George Chatzisofroniou
2012/4/3 Terri Oda
 Hi George,

 Your MailmanStats project looks great and would totally fit with what we
 have in mind for stats, though I'm guessing the hyperkitty team has some
 much more extensive work in mind making use of post ratings, tags, etc.

 If you're putting together your proposal now, do feel free to mention both
 projects as sources of interest.  Since you already have the stats code
 available, it might be possible to toss the integration in there after doing
 some other work.  Normally I worry about students biting off more than they
 can chew, but given your prior experience with Mailman and the fact that you
 already have the basic code, you can make a case for being able to package
 up that code and contribute it in a week or two out of your summer if you're
 ready for a code review.


 PS -  For further advice regarding search projects, see my previous post to

Thanks for your response, Terri,

I thought about it quite a lot.

In the end, I think it is better to implement only the Metrics idea
(since I already have the base code) by integrating my software into
Mailman. My previous experience with Mailman and the fact that I have
already done some work should lead to a better result.

I'll send my proposal in the next hours. I'll appreciate any feedback.


George Chatzisofroniou

Re: [Mailman-Developers] [GSoC 2012] Candidate on 'Integration of (existing) search code into Mailman archives'

2012-04-02 Thread Terri Oda

Hi George,

Your MailmanStats project looks great and would totally fit with what we 
have in mind for stats, though I'm guessing the hyperkitty team has some 
much more extensive work in mind making use of post ratings, tags, etc.

If you're putting together your proposal now, do feel free to mention 
both projects as sources of interest.  Since you already have the stats 
code available, it might be possible to toss the integration in there 
after doing some other work.  Normally I worry about students biting off 
more than they can chew, but given your prior experience with Mailman 
and the fact that you already have the basic code, you can make a case 
for being able to package up that code and contribute it in a week or 
two out of your summer if you're ready for a code review.


PS -  For further advice regarding search projects, see my previous post 
to mailman-developers.

On 03/26/2012 03:38 PM, George Chatzisofroniou wrote:

Hello Mailman Developers,

My name is George Chatzisofroniou, i'm 20 years old and i'm an
undergraduate student in the Department of Informatics at the
University of Piraeus (Greece).

I have solid previous experience with Mailman, because I have been
using it to manage mailing lists for almost three years.

I have also developed, with a friend of mine, MailmanStats [1], a
Python software that outputs statistics for a mailing list based on
Mailman. I think this implements the 'metric' idea in some way. I
would like to know your opinion about MailmanStats.

I'm sending this mail to let you know that I would like to join the
Mailman development team, starting with Google Summer of Code 2012. The
idea that excites me most is the 'Integration of (existing) search
code into Mailman archives'. I think it is better to develop it on
Mailman v3 rather than v2. I realize the significance of a feature
like this. Many times before, I've gone through the archives to search
for a specific thread, so an addition like this would be great!

As another student mentioned this idea is kinda small for the whole
summer, so if there is time left i could integrate my MailmanStats [1]
software into Mailman and/or build CSS styles for the web UI.

Please tell me what you think. I'm also on IRC by the name sophron.




Re: [Mailman-Developers] GSoC 2012 - NNTP archive access

2012-04-01 Thread Alexander Sulfrian

On Tue Mar 27 22:52:04 CEST 2012, Barry Warsaw wrote:

 On Mar 27, 2012, at 09:09 PM, Alexander Sulfrian wrote:

  What are the next steps you would propose? I am unfortunately not up
  to date with the development of Mailman 3. But I am a little bit
  familiar with the Mailman 2 source code.

 MM3 will be a better platform to build something like the NNTP
 access on.  The question in my mind is whether this should be done
 as part of the various independent (but related) archiver projects,
 or whether it should be done as a separate archiver.

There is a second question connected with that: Should the messages
be kept in an additional storage for NNTP access or should the default
archiver be responsible for storage and should be extended with methods
for accessing specific messages?

 In mm3, there's an API for feeding posted messages to an IArchiver,
 but this is quite flexible.  I could imagine that something on the
 other end of this vended messages via NNTP instead of HTTP. 

This would be the scenario if implementing the NNTP access in a new
archiver, separated from the other.

 The one key difference is that you'd like to be able to post to the
 mailing list through NNTP, with probably some additional posting
 rules (e.g. if you're not a member, but we know you, or you've
 been approved for posting a few times, your message wouldn't get
 held for moderator approval).

If it should be possible to post messages over the NNTP transport,
that does not match the classic design of an archiver. I do not know
whether there is an API to post messages, but it would possibly be
better to implement the NNTP archive as an external module that could
maybe even run on a separate server. 

 If I was doing this, I'd probably look seriously at Twisted as the
 basis for implementing the NNTP side of things.  I haven't looked in
 quite a while, but at the time, it had great support for NNTP

Yes, twisted should be the right choice. There is a twisted module for
implementing a NNTPServer[1], but it is not very well documented. But
even if it is not working, it should not be hard to implement it. The
NNTP commands described in RFC3977[2] do not look very complicated.
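
To give a feel for how small the command set is, here is a toy dispatcher for two RFC 3977 commands, GROUP and QUIT, using the response codes from that RFC (211, 411, 205, 500, 501). It is plain Python and deliberately does not use Twisted's NNTP module; the function name and the `groups` mapping are hypothetical, and a real server would of course need article retrieval, posting and connection handling on top.

```python
def handle_command(line, groups):
    """Answer one NNTP command line; groups maps group name -> (count, low, high)."""
    parts = line.strip().split()
    if not parts:
        return "500 Unknown command"
    # Command keywords are case-insensitive.
    keyword, args = parts[0].upper(), parts[1:]
    if keyword == "GROUP":
        if len(args) != 1:
            return "501 Syntax error"
        stats = groups.get(args[0])
        if stats is None:
            return "411 No such newsgroup"
        count, low, high = stats
        # 211 reply: estimated article count, low and high water marks, group name.
        return f"211 {count} {low} {high} {args[0]}"
    if keyword == "QUIT":
        return "205 Connection closing"
    return "500 Unknown command"


groups = {"mailman.developers": (1234, 3000, 4233)}
print(handle_command("GROUP mailman.developers", groups))
print(handle_command("group no.such.list", groups))
print(handle_command("QUIT", groups))
```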

In addition, there is the question of whether it should be
possible to sync several Mailman servers over the NNTP protocol. That
could be a way to do clustering for load balancing or
something like that.





Re: [Mailman-Developers] GSoC 2012 - NNTP archive access

2012-04-01 Thread Barry Warsaw
BTW, the NNTP queue runner has now been ported to Mailman 3.  You will need to
re-run bin/buildout though, to pick up the new dependency on the mock library.

On Apr 01, 2012, at 04:08 PM, Alexander Sulfrian wrote:

 MM3 will be a better platform to build something like the NNTP
 access on.  The question in my mind is whether this should be done
 as part of the various independent (but related) archiver projects,
 or whether it should be done as a separate archiver.

There is a second question connected with that: should the messages
be kept in additional storage for NNTP access, or should the default
archiver be responsible for storage and be extended with methods
for accessing specific messages?

This is a good, but larger question.  I've always thought that Mailman will
require a message store as defined in the IMessageStore interface.  What
might make sense is to have a single implementation that satisfies the
IArchiver and IMessageStore (and possibly other interfaces), but with a single
on-disk storage.  This could in fact be the thing that backs the prototype

 In mm3, there's an API for feeding posted messages to an IArchiver,
 but this is quite flexible.  I could imagine that something on the
 other end of this vended messages via NNTP instead of HTTP. 

That would be the scenario if NNTP access were implemented in a new
archiver, separate from the others.

With the above, you probably wouldn't need this except as you say, if it is a
separate archiver.
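
As a rough illustration of the single-implementation idea above, one object can play both the archiver role and the message-store role over one storage. The class and method names below are hypothetical and only loosely modelled on the IArchiver and IMessageStore roles; a dict stands in for the on-disk storage.

```python
from email import message_from_string


class SingleStoreArchive:
    """Hypothetical sketch: one storage backs archiving and lookup."""

    def __init__(self):
        self._by_id = {}    # Message-ID -> message object
        self._by_list = {}  # list name -> Message-IDs in arrival order

    def archive_message(self, list_name, msg):
        """Archiver role: record a posted message for a list."""
        message_id = msg["Message-ID"]
        if message_id is None:
            raise ValueError("message has no Message-ID header")
        # Store each message only once, even if it reaches several lists.
        self._by_id.setdefault(message_id, msg)
        ids = self._by_list.setdefault(list_name, [])
        if message_id not in ids:
            ids.append(message_id)
        return message_id

    def get_message_by_id(self, message_id):
        """Message-store role: fetch one specific message, or None."""
        return self._by_id.get(message_id)

    def message_ids(self, list_name):
        """Message-IDs archived for a list, oldest first."""
        return list(self._by_list.get(list_name, []))


store = SingleStoreArchive()
sample = message_from_string(
    "Message-ID: <1@example.org>\nSubject: hello\n\nFirst post.\n")
store.archive_message("example.list", sample)
```

An NNTP front end could then answer article requests from `get_message_by_id` without keeping a second copy of the messages.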

 The one key difference is that you'd like to be able to post to the
 mailing list through NNTP, with probably some additional posting
 rules (e.g. if you're not a member, but we know you, or you've
 been approved for posting a few times, your message wouldn't get
 held for moderator approval).

If it should be possible to post messages over the NNTP transport,
that does not match the classic design of an archiver. I do not know
whether there is an API to post messages, but it might be
better to implement the NNTP archive as an external module that could
maybe even run on a separate server.

Yes, now that the NNTPRunner is functional, it should be possible to set this
up as posting to an NNTP service that a site could run, independent of

 If I was doing this, I'd probably look seriously at Twisted as the
 basis for implementing the NNTP side of things.  I haven't looked in
 quite a while, but at the time, it had great support for NNTP

Yes, Twisted should be the right choice. There is a Twisted module for
implementing an NNTPServer[1], but it is not very well documented. Even
if it does not work, implementing one should not be hard: the
NNTP commands described in RFC 3977[2] do not look very complicated.

In addition to that, there is also the question of whether it should be
possible to sync a few Mailman servers over the NNTP protocol. That
would make clustering for load balancing, or something like that,
possible.

That's a pretty cool idea, actually.  Something fun to explore for 3.1



Re: [Mailman-Developers] GSoC 2012

2012-03-28 Thread Barry Warsaw
On Mar 24, 2012, at 03:19 AM, vikash agrawal wrote:

I am Vikash and very much interested in contributing to mailman and being a
GSoC student this year. So far, I have successfully installed mailman in my


I do have skills in Python 2.7, but as I am very new to Mailman, I am
looking for something small to hack on that is doable this summer.  Also, the
ideas page does not mention the skills required for each project, so it's somewhat
difficult for me to choose one. I would therefore like you to guide me on
this.  I am willing to learn a lot this summer :-)

Fantastic!  This is the right place to ask questions.  Also many of us hang
out on IRC using the freenode channel #mailman.

Note that I've started to tag bugs in the tracker with 'easy' if I think they
are (but I could be wrong :).  So you could search
for the 'easy' and 'mailman3' tags to find things to get started with.



[Mailman-Developers] GSoC 2012 - NNTP archive access

2012-03-27 Thread Alexander Sulfrian

I would like to participate in the Google Summer of Code this year for
Mailman. While reading through the ideas in the wiki, the NNTP archive
access idea looked very interesting.

I took part in GSoC last year for VLC, developing a C library for
accessing a Sony MiniDisc player from Linux. This year VLC
unfortunately got rejected by Google.

What next steps would you propose? Unfortunately I am not up to
date with the development of Mailman 3, but I am a little bit familiar
with the Mailman 2 source code.



[Mailman-Developers] [GSoC 2012] Candidate on 'Integration of (existing) search code into Mailman archives'

2012-03-26 Thread George Chatzisofroniou
Hello Mailman Developers,

My name is George Chatzisofroniou, I'm 20 years old and I'm an
undergraduate student in the Department of Informatics at the
University of Piraeus (Greece).

I have really good previous experience with Mailman, because I
have been using it to manage mailing lists for almost three years.

I have also developed, with a friend of mine, MailmanStats [1], a
Python software that outputs statistics for a mailing list based on
Mailman. I think this implements the 'metric' idea in some way. I
would like to know your opinion about MailmanStats.

I'm sending this mail to let you know that I would like to join the
Mailman development team, starting with Google Summer of Code 2012. The
idea that excites me most is the 'Integration of (existing) search
code into Mailman archives'. I think it is better to develop it on
Mailman v3 rather than v2. I realize the significance of a feature
like this. Many times before, I've gone through the archives to search
for a specific thread, so an addition like this would be great!

As another student mentioned, this idea is kinda small for the whole
summer, so if there is time left I could integrate my MailmanStats [1]
software into Mailman and/or build CSS styles for the web UI.

Please tell me what you think. I'm also on IRC under the name sophron.



George Chatzisofroniou

[Mailman-Developers] GSoC 2012

2012-03-23 Thread vikash agrawal
Hi Everyone,
I am Vikash and very much interested in contributing to mailman and being a
GSoC student this year. So far, I have successfully installed mailman in my
I do have skills in Python 2.7, but as I am very new to Mailman, I am
looking for something small to hack on that is doable this summer.
Also, the ideas page does not mention the skills required for each project, so
it's somewhat difficult for me to choose one. I would therefore like you
to guide me on this.
I am willing to learn a lot this summer :-)
Vikash Agrawal

[Mailman-Developers] gsoc-2012

2012-03-10 Thread abdul rauf
I was going through the Mailman ideas page for GSoC 2012, and I would like
to work on Mailman.
If there are more ideas, I would like to see them. I am currently in the 3rd
year of my undergraduate studies at K.B.N.C.E Gulbarga.

Re: [Mailman-Developers] GSoC 2012

2012-02-09 Thread Barry Warsaw
On Feb 06, 2012, at 02:19 PM, Terri Oda wrote:

So, GSoC applications have opened... so I've started a wiki page for us to
capture project ideas as we get them formed:

Thanks for taking this on Terri.

One of the things I'd like to see is getting a student to move forwards with
integrating BrowserID into Mailman, which some of us have discussed already.

Yep, that would be a great project.  I know there's some controversy around
BrowserID (already), but it's worth investigating.  OpenID is I suppose
another thing to look at.

I'd like to see some more work done on the UI as well, but just saying that
is nebulous.  What more specific targeted projects do we have?  I'd like to
see integration of the search code we have from a previous GSoC into the
archives, for example.  What else do we have that's feasible for a student
over a 4 month period?

I think we'll have a much better idea about specifics after the Pycon sprint.
I'd like that to be really focused on integrating the web ui with the core.

Another thing to look at (added to the wiki page) is Grackle, which is an
archiver framework that the Launchpad folks are working on.  MHonArc is
problematic for them, and clearly Pipermail is ancient.  Grackle's approach
again is to provide a REST API to an underlying archiver, so that the actual
rendering can happen separately from storage, threading, etc.  Plus, it's



[Mailman-Developers] GSoC 2012

2012-02-06 Thread Terri Oda
So, GSoC applications have opened... so I've started a wiki page for us 
to capture project ideas as we get them formed:

One of the things I'd like to see is getting a student to move forwards 
with integrating BrowserID into Mailman, which some of us have discussed
already.

I'd like to see some more work done on the UI as well, but just saying 
that is nebulous.  What more specific targeted projects do we have?  I'd 
like to see integration of the search code we have from a previous GSoC 
into the archives, for example.  What else do we have that's feasible 
for a student over a 4 month period?

