Hi Joe,

when analyzing the patch `<[email protected]>` [1] with `get_maintainer.pl --subsystem --status --separator , /tmp/patch`, I get the following output:
Chris Mason <[email protected]> (maintainer:BTRFS FILE SYSTEM),Josef Bacik <[email protected]> (maintainer:BTRFS FILE SYSTEM),David Sterba <[email protected]> (maintainer:BTRFS FILE SYSTEM),Alexander Viro <[email protected]> (maintainer:FILESYSTEMS (VFS and infrastructure)),"Theodore Ts'o" <[email protected]> (maintainer:EXT4 FILE SYSTEM),Andreas Dilger <[email protected]> (maintainer:EXT4 FILE SYSTEM),Jaegeuk Kim <[email protected]> (maintainer:F2FS FILE SYSTEM),Changman Lee <[email protected]> (maintainer:F2FS FILE SYSTEM),Miklos Szeredi <[email protected]> (maintainer:FUSE: FILESYSTEM IN USERSPACE),Steven Whitehouse <[email protected]> (supporter:GFS2 FILE SYSTEM),Anton Altaparmakov <[email protected]> (supporter:NTFS FILESYSTEM),Hugh Dickins <[email protected]> (maintainer:TMPFS (SHMEM FILESYSTEM)),[email protected] (open list:BTRFS FILE SYSTEM),[email protected] (open list),[email protected] (open list:FILESYSTEMS (VFS and infrastructure)),[email protected] (open list:EXT4 FILE SYSTEM),[email protected] (open list:F2FS FILE SYSTEM),[email protected] (open list:FUSE: FILESYSTEM IN USERSPACE),[email protected] (open list:GFS2 FILE SYSTEM),[email protected] (open list:NTFS FILESYSTEM),[email protected] (open list:MEMORY MANAGEMENT)
Maintained,Buried alive in reporters,Supported
BTRFS FILE SYSTEM,THE REST,FILESYSTEMS (VFS and infrastructure),EXT4 FILE SYSTEM,F2FS FILE SYSTEM,FUSE: FILESYSTEM IN USERSPACE,GFS2 FILE SYSTEM,NTFS FILESYSTEM,MEMORY MANAGEMENT,TMPFS (SHMEM FILESYSTEM)
How can I parse this output automatically? Or how can I generate parsable output?
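To illustrate why the status line cannot simply be paired with the subsystem line, here is a minimal Python sketch. It is not part of `get_maintainer.pl`; the helper name `split_fields` is made up for this example, and it assumes that no status or subsystem name contains the separator itself.

```python
# Status and subsystem lines copied from the output above as single lines
# (separator ","). split_fields is a hypothetical helper for illustration.
status_line = "Maintained,Buried alive in reporters,Supported"
subsystem_line = (
    "BTRFS FILE SYSTEM,THE REST,FILESYSTEMS (VFS and infrastructure),"
    "EXT4 FILE SYSTEM,F2FS FILE SYSTEM,FUSE: FILESYSTEM IN USERSPACE,"
    "GFS2 FILE SYSTEM,NTFS FILESYSTEM,MEMORY MANAGEMENT,"
    "TMPFS (SHMEM FILESYSTEM)"
)


def split_fields(line, separator=","):
    """Split one output line on the separator and trim each field."""
    return [field.strip() for field in line.split(separator)]


statuses = split_fields(status_line)
subsystems = split_fields(subsystem_line)

# Each distinct status is printed only once, so the two lists have
# different lengths and cannot be zipped into (subsystem, status) pairs.
print(len(statuses), "statuses vs.", len(subsystems), "subsystems")
```

The differing list lengths are the core of the problem: the positions in the status line do not correspond to positions in the subsystem line.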
I need the tuples of subsystems and status:

(THE REST, Buried alive in reporters)
(TMPFS, Maintained)
(BTRFS FILE SYSTEM, Maintained)
…
(GFS2 FILE SYSTEM, Supported)

I don't know how to reliably assign the statuses to the subsystems.

Thank you in advance

Kind regards
Sebastian Duda

[1] https://lore.kernel.org/patchwork/patch/537252/

On 2019-07-19 10:50, Joe Perches wrote:
> On Fri, 2019-07-19 at 07:35 +0000, Duda, Sebastian wrote:
> > Hi Joe,
> >
> > I'm conducting a large-scale patch analysis of the LKML with 1.8 million
> > patch emails. I'm using the `get_maintainer.pl` script to know which
> > patch is related to which subsystem.
>
> The MAINTAINERS file is updated frequently. Are you also using the
> MAINTAINERS file used at the time each patch was submitted?
>
> > I ran into two issues while using the script:
> >
> > 1. When I use the script the trivial way
> >
> > $ scripts/get_maintainer.pl --subsystem --status --separator , drivers/media/i2c/adv748x/
> > Kieran Bingham <[email protected]> (maintainer:ANALOG DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC ADV748X DRIVER),[email protected] (open list)
> > Maintained,Buried alive in reporters
> > ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE (V4L/DVB),THE REST
> >
> > the output is hard to parse because the status `Maintained` is displayed
> > only once but related to two subsystems. I'd prefer a more table-like
> > representation, like this:
> >
> > Kieran Bingham <[email protected]> (maintainer:ANALOG DEVICES INC ADV748X DRIVER),[email protected] (open list:ANALOG DEVICES INC ADV748X DRIVER),ANALOG DEVICES INC ADV748X DRIVER,Maintained
> > Mauro Carvalho Chehab <[email protected]> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB)),MEDIA INPUT INFRASTRUCTURE (V4L/DVB),Maintained
> > [email protected] (open list),THE REST,Buried alive in reporters
> >
> > 2. I want to analyze multiple patches; currently I am calling the script
> > once per patch. When calling the script with multiple files, the files'
> > output is merged:
> >
> > $ scripts/get_maintainer.pl --subsystem --status --separator ',' drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
> > Kieran Bingham <[email protected]> (maintainer:ANALOG DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC ADV748X DRIVER),[email protected] (open list),[email protected] (open list:ACPI WMI DRIVER)
> > Maintained,Buried alive in reporters,Orphan
> > ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE (V4L/DVB),THE REST,ACPI WMI DRIVER
> >
> > I'd like to run the script with all files but separated output, like this:
> >
> > $ scripts/get_maintainer.pl --subsystem --status --separator ',' --separate-files drivers/media/i2c/adv748x/ include/uapi/linux/wmi.h
> > Kieran Bingham <[email protected]> (maintainer:ANALOG DEVICES INC ADV748X DRIVER),Mauro Carvalho Chehab <[email protected]> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB)),[email protected] (open list:ANALOG DEVICES INC ADV748X DRIVER),[email protected] (open list)
> > Maintained,Buried alive in reporters
> > ANALOG DEVICES INC ADV748X DRIVER,MEDIA INPUT INFRASTRUCTURE (V4L/DVB),THE REST
> >
> > [email protected] (open list:ACPI WMI DRIVER),[email protected] (open list)
> > Orphan,Buried alive in reporters
> > ACPI WMI DRIVER,THE REST
> >
> > My questions are:
> >
> > 1. How can I make get_maintainer's output more table-like?
>
> I suggest adding --nogit --nogit-fallback --roles --norolestats
>
> > 2. How can I make get_maintainer.pl separate each file's output?
>
> Run the script with multiple invocations, once for each file modified by
> the patch.
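The two suggestions in the reply (the extra flags, and one invocation per file) could be combined roughly as in the following Python sketch. It is only an illustration under assumptions: the function names, the regular expression, and the `kernel_tree` parameter are made up for this example, and it assumes the script prints one `address (role:SUBSYSTEM)` entry per line when no `--separator` is given. Entries that carry several comma-joined roles are not split further.

```python
import re
import subprocess

# Flags suggested in the reply above.
SUGGESTED_FLAGS = ["--nogit", "--nogit-fallback", "--roles", "--norolestats"]

# "<address> (<role>[:<subsystem>])". The role part is matched greedily so
# that subsystem names containing parentheses, e.g. "(V4L/DVB)", stay intact.
ENTRY_RE = re.compile(r"^(?P<address>.*?) \((?P<role>.*)\)$")


def parse_entry(line):
    """Split one output line into (address, role, subsystem or None)."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    # Split on the first colon only: names like "FUSE: FILESYSTEM IN
    # USERSPACE" contain a colon themselves.
    role, _, subsystem = match.group("role").partition(":")
    return match.group("address"), role, subsystem or None


def maintainers_per_file(kernel_tree, paths):
    """Run get_maintainer.pl once per path and keep the results separate."""
    results = {}
    for path in paths:
        proc = subprocess.run(
            ["scripts/get_maintainer.pl", *SUGGESTED_FLAGS, path],
            cwd=kernel_tree, capture_output=True, text=True, check=True,
        )
        entries = [parse_entry(line) for line in proc.stdout.splitlines()]
        results[path] = [entry for entry in entries if entry is not None]
    return results


# Placeholder addresses; the real ones are not reproduced here.
sample = "Jane Doe <jane@example.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB))"
print(parse_entry(sample))
```

Calling `maintainers_per_file("/path/to/linux", ["drivers/media/i2c/adv748x/", "include/uapi/linux/wmi.h"])` would then give one list of entries per path. An entry with a bare `(open list)` role yields `None` as the subsystem; in the output quoted above that entry corresponds to THE REST.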

