Kent

The citation without the name is perfect (and this appears to be how most 
citation parsers work).  There are two issues in the test run:

1.  The parallel citation 422 U.S. 490, 499 n. 10, 95 S.Ct. 2197, 2205 n. 10, 
45 L.Ed.2d 343 (1975) is resolved as:

422 U.S. 490 (1975)
499 n. 10 (1975)
95 S.Ct. 2197 (1975)
2205 n. 10 (1975)
45 L.Ed.2d 343 (1975)

instead of as:

422 U.S. 490, 499 n. 10 (1975)
95 S.Ct. 2197, 2205 n. 10 (1975)
45 L.Ed.2d 343 (1975)

i.e., parsing the second page reference should pick up all alphanumeric
characters between the commas, so that "499 n. 10" stays attached to the
citation it follows.

2. It doesn't parse the last citation, i.e., 463 U.S. 29, 43, 103 S.Ct. 2856, 
2867, 77 L.Ed.2d 443 (1983).  I tested it on another sample text and it missed 
the last citation there too.
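
A rough regex sketch of the grouping described in issue 1 may help pin down
the expected behaviour. This is not the attached PLY parser: the reporter
list, the names PIN, ONE, GROUP and resolve(), and the choice to drop any
court name inside the parentheses are all illustrative assumptions.

```python
import re

# Reporter abbreviations seen in this thread only (an assumed, partial list).
REPORTER = r"(?:U\.S\.|S\.Ct\.|L\.Ed\.2d|F\.[23]d)"

# A pinpoint: ", 499", ", 1517-18" or ", 499 n. 10". The lookahead rejects a
# number that is really the volume of the next parallel citation.
PIN = r"(?:,\s+\d+(?:-\d+)?(?:\s+n\.\s*\d+)?(?=\s*(?:[,(]|\Z)))"

# One citation: volume, reporter, first page, then any pinpoints.
ONE = r"\b\d+\s+" + REPORTER + r"\s+\d+" + PIN + r"*"
ONE_RE = re.compile(ONE)

# A run of parallel citations closed by "(1975)" or "(7th Cir.1994)".
GROUP = re.compile(
    r"(?P<cites>" + ONE + r"(?:,\s+" + ONE + r")*)"
    r"\s*\([^()]*?(?P<year>\d{4})\)"
)


def resolve(text):
    """Return each parallel citation with its own pinpoints and the year."""
    results = []
    for group in GROUP.finditer(text):
        year = group.group("year")
        for cite in ONE_RE.finditer(group.group("cites")):
            # Collapse line breaks and runs of spaces inside the citation.
            normalized = " ".join(cite.group(0).split())
            results.append("%s (%s)" % (normalized, year))
    return results


sample = ("422 U.S. 490, 499 n. 10, 95 S.Ct. 2197, 2205 n. 10, \n"
          "45 L.Ed.2d 343 (1975)")
for line in resolve(sample):
    print(line)
```

Because the year is only attached once the closing parenthesis is found, the
pinpoints never become separate entries of their own.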

Thanks!

Dinesh


 
From: Kent Johnson 
Sent: Tuesday, February 10, 2009 4:01 AM
To: Dinesh B Vadhia 
Cc: tutor@python.org 
Subject: Re: [Tutor] Picking up citations


On Mon, Feb 9, 2009 at 12:51 PM, Dinesh B Vadhia
<dineshbvad...@hotmail.com> wrote:
> Kent /Emmanuel
>
> Below are the results using the PLY parser and Regex versions on the
> attached 'sierra' data which I think covers the common formats.  Here are
> some 'fully unparsed' citations that were missed by the programs:
>
> Smith v. Wisconsin Dept. of Agriculture, 23 F.3d 1134, 1141 (7th Cir.1994)
>
> Indemnified Capital Investments, S.A. v. R.J. O'Brien & Assoc., Inc., 12
> F.3d 1406, 1409 (7th Cir.1993).
>
> Hunt v. Washington Apple Advertising Commn., 432 U.S. 333, 343, 97 S.Ct.
> 2434, 2441, 53 L.Ed.2d 383 (1977)
>
> Idaho Conservation League v. Mumma, 956 F.2d 1508, 1517-18 (9th Cir.1992)

A few issues here:
S.A. - it is hard to allow this while still filtering out sentences
R.J. O'Brien, etc. - loosening up the rules for the second name can allow these
1517-18 - allow page ranges
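
As a small illustration of the page-range point, the sketch below accepts
either a single page or a range and expands an abbreviated end page such as
"1517-18". It is separate from the attached PLY version; PAGE_RANGE and
expand_pages() are hypothetical names used only for this example.

```python
import re

# A single page ("1141") or a hyphenated range ("1517-18").
PAGE_RANGE = re.compile(r"(?P<start>\d+)(?:-(?P<end>\d+))?\Z")


def expand_pages(pin):
    """Return (first_page, last_page) for a page or page range string."""
    match = PAGE_RANGE.match(pin.strip())
    if match is None:
        raise ValueError("not a page or page range: %r" % pin)
    start = match.group("start")
    end = match.group("end")
    if end is None:
        # A single page is treated as a one-page range.
        return int(start), int(start)
    if len(end) < len(start):
        # Abbreviated end page: borrow the leading digits of the start page.
        end = start[:len(start) - len(end)] + end
    return int(start), int(end)


print(expand_pages("1517-18"))
print(expand_pages("1141"))
```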

The name issues are getting to be too much for me. Attached is a PLY
version that just pulls out the citation without the name; at one
point you indicated that would work for you.

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
