On Fri, 2019-07-19 at 07:35 +0000, Duda, Sebastian wrote:
> Hi Joe,
>
> I'm conducting a large-scale patch analysis of the LKML with 1.8 million
> patch emails. I'm using the `get_maintainer.pl` script to know which
> patch is related to which subsystem.

The MAINTAINERS file is updated frequently.

Are you also using the MAINTAINERS file used at the time each patch
was submitted?

> I ran into two issues while using the script:
>
> 1. When I use the script the trivial way
>
> $ scripts/get_maintainer.pl --subsystem --status --separator ,
> drivers/media/i2c/adv748x/
> Kieran Bingham <[email protected]> (maintainer:ANALOG
> DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]>
> (maintainer:MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC
> ADV748X DRIVER),[email protected] (open list)
> Maintained,Buried alive in reporters
> ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB),THE REST
>
> the output is hard to parse because the status `Maintained` is displayed
> only once but related to two subsystems.
>
> I'd prefer a more table like representation, like this:
>
> Kieran Bingham <[email protected]> (maintainer:ANALOG
> DEVICES INC ADV748X DRIVER),[email protected] (open
> list:ANALOG DEVICES INC ADV748X DRIVER),ANALOG DEVICES INC ADV748X
> DRIVER,Maintained
> Mauro Carvalho Chehab <[email protected]> (maintainer:MEDIA INPUT
> INFRASTRUCTURE (V4L/DVB)),MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB),Maintained
> [email protected] (open list),THE REST,Buried alive in
> reporters
>
>
> 2. I want to analyze multiple patches, currently I am calling the script
> once per patch. When calling the script with multiple files the files
> output is merged
>
> $ scripts/get_maintainer.pl --subsystem --status --separator ','
> drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
> Kieran Bingham <[email protected]> (maintainer:ANALOG
> DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]>
> (maintainer:MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC
> ADV748X DRIVER),[email protected] (open
> list),[email protected] (open list:ACPI WMI DRIVER)
> Maintained,Buried alive in reporters,Orphan
> ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB),THE REST,ACPI WMI DRIVER
>
> I'd like to run the script with all files but separated output, like
> this:
>
> $ scripts/get_maintainer.pl --subsystem --status --separator ','
> --separate-files drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
> Kieran Bingham <[email protected]> (maintainer:ANALOG
> DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]>
> (maintainer:MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC
> ADV748X DRIVER),[email protected] (open list)
> Maintained,Buried alive in reporters
> ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE
> (V4L/DVB),THE REST
>
> [email protected] (open list:ACPI WMI
> DRIVER),[email protected] (open list)
> Orphan,Buried alive in reporters
> ACPI WMI DRIVER,THE REST
>
>
> My Questions are:
> 1. How can I make get_maintainer's output to be more table-like?

I suggest adding --nogit --nogit-fallback --roles --norolestats

> 2. How can I make get_maintainer.pl to separate each file's output?

Run the script with multiple invocations, once for each file
modified by the patch.
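
For what it's worth, the "one invocation per file" approach could be driven by a small wrapper along these lines. This is only a rough sketch, not something shipped with the kernel: the helper names `build_command` and `run_per_file` are made up for illustration, the script path assumes you run from the top of a kernel tree, and passing `-f` to mark the argument as a file rather than a patch is my assumption about how you would want to call it. The other flags are the ones suggested above.

```python
import subprocess

# Flags suggested in the reply above. "-f" (added in build_command) tells the
# script the argument is a file path rather than a patch; that is an assumption
# about the intended use, not something stated in the thread.
BASE_FLAGS = ["--nogit", "--nogit-fallback", "--roles", "--norolestats"]


def build_command(path, script="scripts/get_maintainer.pl"):
    """Return the argv list for a single-file get_maintainer.pl call."""
    return [script, *BASE_FLAGS, "-f", path]


def run_per_file(paths, run=subprocess.run):
    """Invoke the script once per path so each file's output stays separate.

    Returns a dict mapping each path to the list of output lines.
    `run` is injectable so the loop can be exercised without a kernel tree.
    """
    results = {}
    for path in paths:
        proc = run(build_command(path), capture_output=True, text=True, check=True)
        results[path] = proc.stdout.splitlines()
    return results


if __name__ == "__main__":
    files = ["drivers/media/i2c/adv748x/", "include/uapi/linux/wmi.h"]
    for path, lines in run_per_file(files).items():
        print(path)
        for line in lines:
            print("    " + line)
```

Keeping the results keyed by path means the per-file grouping you asked for falls out of the data structure, at the cost of one process start per file.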

