William, in my last project that used Doccat, I extended DocumentSample
and added a generic Map to hold additional key-value pairs. Perhaps adding that
to the baseline would be a natural fit.
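
Roughly what I mean is sketched below. This is only an illustration, not the actual OpenNLP class: the name MetadataDocumentSample, the extraInformation field, and the accessor names are all assumptions made up for this example, and a real change would go into DocumentSample itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an extended DocumentSample (not the OpenNLP class).
// It keeps the two existing attributes and adds a generic map for metadata.
final class MetadataDocumentSample {

    private final String category;
    private final List<String> text;
    // Assumed name: free-form key-value pairs that feature generators could read.
    private final Map<String, Object> extraInformation;

    MetadataDocumentSample(String category, List<String> text,
                           Map<String, Object> extraInformation) {
        if (category == null || text == null) {
            throw new IllegalArgumentException("category and text must not be null");
        }
        this.category = category;
        // Defensive copies so later changes by the caller do not leak in.
        this.text = Collections.unmodifiableList(new ArrayList<String>(text));
        if (extraInformation == null) {
            this.extraInformation = Collections.<String, Object>emptyMap();
        } else {
            this.extraInformation = Collections.unmodifiableMap(
                    new HashMap<String, Object>(extraInformation));
        }
    }

    String getCategory() {
        return category;
    }

    List<String> getText() {
        return text;
    }

    Map<String, Object> getExtraInformation() {
        return extraInformation;
    }
}
```

A sample built this way could carry, say, a "source" or "author" entry next to the tokens. Passing null for the map keeps the behaviour of today's two-attribute sample.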

> On Apr 15, 2014, at 11:45 AM, William Colen <william.co...@gmail.com> wrote:
> 
> Hello,
> 
> I've been working with the Doccat module and I am wondering if we could
> improve its data structure for the 1.6.0 release.
> 
> Today the DocumentSample has the following attributes:
> 
> - String category
> - List<String> text
> 
> I would suggest adding an attribute to hold metadata or additional
> context information. What do you think?
> 
> Also, what do you think of including sentence and paragraph information? I
> don't know if there is anything a feature generator can extract from it to
> improve the classification.
> 
> Thank you,
> William
