http://www.anc.org/

... but this suggests the data they collect is only for research and
education.

On 8/8/2012 10:31 AM, Jason Baldridge wrote:
> Sorry if I missed something along the way -- who did the annotation of the
> Wikipedia data?
>
> BTW, the OANC will soon come out with their 3.0 release of MASC (the
> Manually Annotated Sub-Corpus), with about 800k tokens of English text
> (multiple domains, including Twitter, blogs, transcribed speech, and more)
> labeled with several different levels of analysis, including chunks (noun
> and verb), entities, tokens, POS tags, sentence boundaries, and logical
> forms.
>
> http://www.americannationalcorpus.org/MASC/Home.html
>
> On Wed, Aug 8, 2012 at 2:47 AM, Jörn Kottmann <[email protected]> wrote:
>
>> On 08/08/2012 06:16 AM, Michael Schmitz wrote:
>>
>>> Hi, here are some models trained on Wikipedia data.  They have similar
>>> performance.  Is this useful?
>>>
>> Yes, people who do not have access to our MUC-based training
>> data can just use the wiki data instead and combine it with their data.
>>
>> Thanks for sharing.
>>
>> Now all we need is a way to get label corrections from the community :-)
>>
>> Jörn
>>
>
>
