On Sun, Nov 25, 2012 at 1:56 AM, Ezio Melotti <ezio.melo...@gmail.com> wrote:

> On Sun, Nov 25, 2012 at 12:24 AM, anatoly techtonik 
> <techto...@gmail.com> wrote:
>
>> On Sat, Nov 24, 2012 at 7:04 AM, Ezio Melotti <ezio.melo...@gmail.com>
>>  wrote:
>>
>>> Thanks for your work!
>>>
>>> I played a bit with the code tonight and used it to create a json that
>>> maps filename -> list of issues with a patch that affects that filename.
>>>
>>
>> Nice. Is it possible to add this lookup to the post-commit hook script, to
>> report the number of patches available for the files that have just been
>> committed? It is much easier to review code that's already in your mind.
>>
>
> I'm not sure I understand what you are asking here.  Are you suggesting to
> add a mercurial hook that, once you commit/push something, suggests other
> issues with patches that affect the same file(s)?
>

Right. After you've pushed something, a short message from the server
appears on the screen:

  Thanks for your contribution to modules XXX, YYY, ... .

  Please note that the files you've touched have X open patches
  on the bug tracker. It might be a good time to review some now.
  ...

This serves two purposes:
1. It encourages people to review patches, or do something else about them
(e.g. split issues), to keep them from languishing.
2. It helps complete the stdlib.json mapping (yes, I am lazy, and it will
be more fun for people to fill in their missing modules themselves).
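
A minimal sketch of such a hook, assuming a server-side Mercurial
"changegroup" hook and the files.json mapping discussed below (the hook
name, paths, and message wording are placeholders of mine):

    # Hypothetical hook: after a push, look up each touched file in
    # files.json (path -> list of issue ids with patches) and print
    # the reminder.  The json location and layout are assumptions.
    import json

    def reviewnag(ui, repo, node, **kwargs):
        with open('/srv/tracker/files.json') as f:
            issues_by_file = json.load(f)
        # 'node' is the first changeset added by the push; walk to tip.
        touched = set()
        for rev in range(repo[node].rev(), len(repo)):
            touched.update(repo[rev].files())
        hits = dict((path, issues_by_file[path])
                    for path in touched if path in issues_by_file)
        if not hits:
            return
        total = sum(len(v) for v in hits.values())
        ui.status('Thanks for your contribution!\n')
        ui.status('The files you touched have %d open patches on the '
                  'bug tracker. It might be a good time to review '
                  'some:\n' % total)
        for path in sorted(hits):
            ui.status('  %s: %s\n'
                      % (path, ', '.join('#%s' % i for i in hits[path])))

It would be enabled in the repository's hgrc with something like
"changegroup.reviewnag = python:/srv/tracker/reviewnag.py:reviewnag"
under the [hooks] section.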

> This could be done, but I think it's better to make the data available in
> the tracker so that developers can search other issues themselves.
>

Tracker integration is the next logical step (because it requires more
effort). It is good to have both, because we can't enforce any process
other than the one people are already used to, and there is a chance that
new search capabilities just won't be used.


>>> I made a simple page to filter the results and uploaded it here for now:
>>> http://wolfprojects.altervista.org/issues.html
>>> It requires javascript and it's a bit slow (at least on my pc), but it
>>> allows you to enter a module name or path and it will list all the issues
>>> related to the files that match the search (regex search should also work).
>>> This is still a work in progress though.
>>> If you want you can find the json at
>>> http://wolfprojects.altervista.org/files.json
>>
>>
>> Good work. I've pulled all the changes, but now I am getting:
>>
>> Traceback (most recent call last):
>>   File "modstats.py", line 142, in <module>
>>      print('#%s: %s' % (issuen, issue['title']))
>>   File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
>>     return codecs.charmap_encode(input,errors,encoding_map)
>> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in
>> position 65: character maps to <undefined>
>>
>
> That's probably due to the limitations of the Windows console.
>

http://wiki.python.org/moin/PrintFails#Windows describes the problem, but
proposes no acceptable solution. Setting an environment variable beforehand
is like placing a mattress just before hitting the ground. Why is it not
possible to fix the situation from inside the script?
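
One possible in-script workaround, offered only as a sketch (the thread
does not settle on one): re-wrap sys.stdout so that characters the console
encoding can't represent are replaced instead of raising
UnicodeEncodeError.

    # Sketch for Python 2: printing u'\u2019' to a cp437 console then
    # degrades to '?' instead of crashing.
    import sys
    import codecs

    if sys.stdout.encoding:  # None when output is redirected
        sys.stdout = codecs.getwriter(sys.stdout.encoding)(
            sys.stdout, errors='replace')

    print(u'curly quote: \u2019')

This loses information, of course, but it keeps scripts like modstats.py
from dying halfway through their output.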


>> Path cleaning is a good thing. Good auto-classification also needs some
>> rules that require additional data:
>>   - detect full path from just filename
>>
>
> The cleanup function I wrote just removes extraneous things from the
> path.  It doesn't verify if the file exists in the Python codebase.
>
>
>>     - if the path is unknown, analyse the filename
>>       - if the filename is unique in the Python source tree, return its path
>>       - if the filename is not unique, compare parent path components
>> recursively
>>         - if not successful, try matching the patch context
>>         - if everything fails, choose the first candidate
>>           - if that also fails, maintain a manual patch <---> file mapping
>>   For that to work we need an index of Python source code directory tree.
>>
>
> I don't think all this is necessary.  Once we have the list of file names
> extracted from the patches, it's enough to search for a keyword or module
> name to find all the related issues.
> For example if you try searching for 'json' on
> http://wolfprojects.altervista.org/issues.html you will find all the
> json-related files, including the ones in the Python package, the C
> acceleration module, the documentation, the tests, and even files like
> "doc\json.rst" that don't exist in the Python codebase or got renamed at
> some point.
>
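
As a rough illustration of the lookup that issues.html does client-side,
here is the same search done in Python against files.json (the layout of
the json is inferred from this thread):

    # Minimal sketch: regex search over files.json, assumed to map
    # path -> list of ids of issues whose patches touch that path.
    import json
    import re

    def find_issues(pattern, files_json='files.json'):
        with open(files_json) as f:
            mapping = json.load(f)
        rx = re.compile(pattern)
        return dict((path, issues) for path, issues in mapping.items()
                    if rx.search(path))

    # e.g. find_issues('json') should surface Lib/json/, Modules/_json.c,
    # the docs and tests, and stray paths like "doc\\json.rst".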

Good user story. I've added the C acceleration module to the stdlib.json
description. My goal is to make patch classification an automatic process.
In the long run there won't be any patches in the tracker that are hard to
apply, so "doc\json.rst" will be gone. There should still be a place to
list all those unknown patches, though, so that people can pick them up
and work on them.
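
To make the automatic classification idea concrete, here is a sketch of
the resolution heuristic from the list quoted earlier (the index format
and function names are mine; the context-match and manual-mapping
fallbacks are left out):

    # Hypothetical resolver: given a path taken from a patch header,
    # find the matching file in a CPython checkout.  Implements the
    # "unique basename, then compare parent components" steps.
    import os
    from collections import defaultdict

    def build_index(root):
        index = defaultdict(list)  # basename -> relative paths ending in it
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                index[name].append(rel.replace(os.sep, '/'))
        return index

    def resolve(patch_path, index):
        parts = patch_path.replace('\\', '/').split('/')
        candidates = index.get(parts[-1], [])
        if len(candidates) == 1:
            return candidates[0]            # filename unique in the tree
        for depth in range(2, len(parts) + 1):
            suffix = '/'.join(parts[-depth:])
            matches = [c for c in candidates if c.endswith(suffix)]
            if len(matches) == 1:
                return matches[0]           # parents disambiguate
        return candidates[0] if candidates else None  # first one, or give up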

>> And add the percentage of recognized patches. With manual classification
>> (triaging) it is possible to keep this percentage at 100.
>>
>
> Trying to establish a mapping between the patches and the actual files is
> cumbersome (especially if it requires manual classification) and might
> end up missing some of the patches if they specify an incorrect path, add
> new files, or affect files that got renamed or deleted.
>

The mapping is just a table with a temporary association. It may not be
necessary if there is a separate page with all kinds of incorrect patches.


> I'm considering adding a way to search for modules to the tracker, in a
> way similar to what I did on
> http://wolfprojects.altervista.org/issues.html.  The tracker has direct
> access to the files and issues, so analyzing the patches and keeping the
> database updated as new patches are attached should be easier.  Once we
> have the equivalent of files.json (maybe in a db table), it's just a matter
> of adding a search form.
>

Sounds good as a temporary solution. I'd still prefer automatic
classification rules, but considering the fact that people cannot remove
their patches, a search form may be the only viable solution.
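
For what it's worth, a sketch of the "files.json in a db table" variant;
the table name, columns, and sample rows are purely illustrative:

    # Hypothetical db-backed version: one row per (path, issue) pair
    # makes the tracker's search form a single query.
    import sqlite3

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE patch_files (path TEXT, issue INTEGER)')
    conn.executemany('INSERT INTO patch_files VALUES (?, ?)',
                     [('Lib/json/__init__.py', 1001),   # made-up ids
                      ('Modules/_json.c', 1001),
                      ('Doc/library/json.rst', 1002)])

    def search(term):
        # substring match; the real form could also accept regexes
        return conn.execute(
            'SELECT path, issue FROM patch_files WHERE path LIKE ?',
            ('%' + term + '%',)).fetchall()

    print(search('json'))  # -> all three sample rows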
_______________________________________________
Tracker-discuss mailing list
Tracker-discuss@python.org
http://mail.python.org/mailman/listinfo/tracker-discuss
