Re: licensecheck and debian/copyright

2009-12-11 Thread Mathieu Parent
Hi,



On Fri, Dec 11, 2009 at 5:34 AM, Charles Plessy ple...@debian.org wrote:
> On Thu, Dec 10, 2009 at 01:56:20AM +, Dmitrijs Ledkovs wrote:
>
> > There isn't a DEP-5 debian/copyright parser available. So this cannot be
> > implemented in licensecheck yet.
>
> Dear Dmitrijs,
>
> Jon Dowland has published an example parser on this list
...

> On my side, I have started to work on a parser for the relaxed syntax I
> propose
...

There is also a lintian bug with an initial patch: #478930
[checks/copyright-file] check for new copyright format

Regards

Mathieu Parent


--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: licensecheck and debian/copyright

2009-12-10 Thread Charles Plessy
On Thu, Dec 10, 2009 at 01:56:20AM +, Dmitrijs Ledkovs wrote:
>
> There isn't a DEP-5 debian/copyright parser available. So this cannot be
> implemented in licensecheck yet.

Dear Dmitrijs,

Jon Dowland has published an example parser on this list
(http://lists.debian.org/msgid-search/20090913225846.gb16...@tchicaya.lan).
However, it is written in Python and is therefore of little help for
licensecheck, which is written in Perl.

On my side, I have started to work on a parser for the relaxed syntax I propose
on my experimental git branch of the DEP
(http://git.debian.org/?p=users/plessy/license-summary.git;a=blob_plain;f=dep5.mdwn).

In that case, it is as simple as:

 - Process paragraphs – separated by an empty line – one by one.
 - Collapse each paragraph into a hash where keys are field names, ignoring
   paragraphs that do not contain fields.

This results in an array of hashes, or in YAML dialect, a sequence of mappings.

use strict;
use warnings;

$/ = undef;                              # Slurp mode: read the whole input at once
# The input handle was lost in the archive; <> (files or STDIN) is assumed here.
my @paragraphs = split (/\n\n/, <>);     # Split on empty lines
my @parsed;
my $counter = 0;

foreach my $paragraph (@paragraphs) {
    # Collapse each paragraph into a hash
    if (my $collapsed = collapse($paragraph)) {
        $parsed[$counter++] = $collapsed;
    }
}

sub collapse {
    my $paragraph = shift;
    my %hash;
    my $current_field = 0;   # Next line may still be part of the field content.
    my @lines = split (/\n/, $paragraph);
    foreach (@lines) {
        if ( /^(\w+)\s*:\s*(.*)$/ ) {    # New fields terminate the previous one.
            $current_field = $1;
            $hash{$1} .= $2;
        } elsif ( /^\s(.*)$/ ) {
            $hash{$current_field} .= "\n$1" if $current_field;
        } else {
            $current_field = 0;          # Lack of indentation also terminates the field.
        }
    }
    return \%hash if keys(%hash);
    return;                  # No fields found: the paragraph is ignored.
}

The above script still has bugs, but I hope it summarises how easy it could be
to write a parser if the DEP is constructed with this as a goal.


I originally proposed a syntax that differs from that of Debian control files,
but I am still dissatisfied even with my own proposal. Whichever format is
used, it is easy to break the syntax, in particular by forgetting the white space
for indentation, or the ‘space-dot’ escape sequence for empty lines in the
‘Debian control’ syntax. From my frustrating experience of adding by hand the
contents of the Artistic License 2.0 to the debian/copyright file of one of
the packages I maintain, I concluded that this fragility can significantly
impair the adoption of DEP-5. So on this list or elsewhere, I think that there
is still some experimentation and consultation to do.
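
To illustrate the ‘space-dot’ convention mentioned above, here is a minimal
sketch of how a multi-line field value in ‘Debian control’ syntax could be
decoded. The helper name decode_field and the sample text are hypothetical and
not part of licensecheck or of any DEP draft. It only assumes the usual rule
that continuation lines start with one space and that a line consisting of a
space followed by a dot stands for an empty line.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: decode the raw value of a
# multi-line field written in 'Debian control' syntax.
sub decode_field {
    my ($raw) = @_;
    my @out;
    foreach my $line (split /\n/, $raw) {
        if ($line =~ /^ \.\s*$/) {        # ' .' alone stands for an empty line
            push @out, '';
        } elsif ($line =~ /^\s(.*)$/) {   # continuation line: drop one leading space
            push @out, $1;
        } else {                          # first line (after 'Field:') is not indented
            push @out, $line;
        }
    }
    return join "\n", @out;
}

# Placeholder text, not a real license.
my $raw = "License name\n first paragraph\n .\n second paragraph";
print decode_field($raw), "\n";
```

A writer forgetting either the leading space or the ‘ .’ line produces input
that this kind of decoder silently misreads, which is the fragility described
above.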

Have a nice day,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan





Re: licensecheck and debian/copyright

2009-12-09 Thread Dmitrijs Ledkovs
2009/12/9 Jérémy Lal jeremy@m4x.org:
> Hi,
> is there a way to automatically remove from licensecheck output all
> the files already described in debian/copyright (when it's properly
> formatted)?
>
> Jérémy.


licensecheck doesn't know how to parse debian/copyright, because the file can
be free-form.

There isn't a DEP-5 debian/copyright parser available. So this cannot be
implemented in licensecheck yet.


--
With best regards


Dmitrijs Ledkovs (for short Dima),
Ледков Дмитрий Юрьевич

()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

