You might also find it helpful to look at the Apertium dictionary format,
which is also standard XML. Here is the link to the svn for the Nepalese
language (it's the closest language to Bengali we have in Apertium so far,
and the Bengali pair is far from finished :( ):
http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-bn-en/

I have been working to find a standard tag set for the Bengali language. So
far I am also making do with the Penn Treebank tagset, but in the future I
might need to extend it to meet my project requirements. *However, I believe
the Penn Treebank tagset is sufficient for a general-purpose dictionary
format.*

The attached file contains the Penn Treebank tagset and also the bilingual
dictionary format from Apertium.

What I'd like to propose is that, instead of using
<pos_tag>Verb, non-3rd person singular present</pos_tag>
you create separate definitions such as verb, person, number and tense, and
then use them as properties of the specific entry. It'd be easier to parse
in the future.
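
To make the idea concrete, here is a rough sketch in Python of what I have
in mind. It is only an illustration: the attribute names (pos, person,
number, tense), their values, and the PENN_FEATURES table are my own
assumptions, not an agreed schema. The element names dict_entry, en_word,
pos_tag and penn_tag come from the examples in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Penn Treebank tags to separate grammatical
# properties. The property names and values are illustrative only
# (e.g. NN is "singular or mass" in Penn terms, simplified here).
PENN_FEATURES = {
    "NN":  {"pos": "noun", "number": "singular"},
    "NNS": {"pos": "noun", "number": "plural"},
    "VBP": {"pos": "verb", "person": "non-3rd",
            "number": "singular", "tense": "present"},
    "VBZ": {"pos": "verb", "person": "3rd",
            "number": "singular", "tense": "present"},
}


def build_entry(entry_id, en_word, penn_tag):
    """Build a <dict_entry> whose <pos_tag> has one attribute per property."""
    entry = ET.Element("dict_entry", id=str(entry_id))
    ET.SubElement(entry, "en_word").text = en_word
    ET.SubElement(entry, "penn_tag").text = penn_tag
    # Properties become attributes instead of one free-text description.
    ET.SubElement(entry, "pos_tag", PENN_FEATURES[penn_tag])
    return entry


def read_features(entry):
    """Return the properties of an entry as a dict, with no string parsing."""
    return dict(entry.find("pos_tag").attrib)


entry = build_entry(1, "read", "VBP")
print(ET.tostring(entry, encoding="unicode"))
print(read_features(entry))
```

With this layout a client can ask for entry.find("pos_tag").get("tense")
directly, rather than splitting a description string on commas.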

On Wed, May 13, 2009 at 8:02 AM, Golam Mortuza Hossain
<gmhoss...@gmail.com> wrote:

> Hi,
>
> On Tue, May 12, 2009 at 5:13 PM, Salahuddin Pasha
> <salahuddi...@gmail.com> wrote:
> > Basic work is already done, but we need to define a standard XML (XML
> > DTD or XML Schema).
> > Example: test XML output.
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <dictionary>
> >       <search_results>
> >               <dict_entry id="1">
> >                       <en_word>read</en_word>
> >                       <pos_tag>Noun, singular or mass</pos_tag>
>
>
> Thanks a lot for your work.
>
> I should suggest that you also try to have an entry for PennTag
> for Parts-of-Speech (pos) like "NN", "VV" etc. So something like
>
> <penn_tag>NN</penn_tag>
>
> This would be needed if Anubadok Online interface needs to update its
> database using your XML gateway of Ankur dictionary database.
>
> Cheers,
> Golam
>
>
> _______________________________________________
> Bengalinux-core mailing list
> Bengalinux-core@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bengalinux-core
>



-- 
Regards
Abu Zaher Md. Faridee

http://zaher14.blogspot.com/
---
Time heals every wound, but time itself is a wound that never heals.