Hi Matthias,

Thanks for the information.

I tested on Moses 3.0, and adding sparse features to the phrase table seems to work.

However, I did not add any flag to the ini file, as suggested by "If a phrase
table contains sparse features, then this needs to be flagged in the
configuration file by adding the word sparse after the phrase table file
name." Did I miss anything?
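
For reference, this is how I read the phrase table line from my first mail (quoted further down). It is only a minimal Python sketch: the function name is made up for illustration and is not part of Moses, and it assumes that the sixth "|||"-separated field holds alternating feature names and values, as in that example.

```python
def parse_phrase_table_line(line):
    """Split one phrase-table line into source, target, dense scores, sparse features."""
    fields = [f.strip() for f in line.split("|||")]
    source, target = fields[0], fields[1]
    # Third field: the dense scores.
    dense = [float(s) for s in fields[2].split()]
    sparse = {}
    # Assumption: the sixth field, if present, is "name value name value ...".
    if len(fields) > 5 and fields[5]:
        tokens = fields[5].split()
        for name, value in zip(tokens[0::2], tokens[1::2]):
            sparse[name] = float(value)
    return source, target, dense, sparse


line = ("das Haus ||| the house ||| 0.8 0.5 0.8 0.5 2.718 ||| 0-0 1-1 "
        "||| 5000 5000 2500 ||| dom_europarl 1")
source, target, dense, sparse = parse_phrase_table_line(line)
print(source, "->", target, dense, sparse)
```

With the example line, the dense part has five scores and the sparse part has the single entry dom_europarl.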

Regards,

Jian




On Thu, Jul 16, 2015 at 3:23 AM, Matthias Huck <mh...@inf.ed.ac.uk> wrote:

> Hi Jian,
>
> That depends on the nature of the features you're planning to
> implement.
>
> In order to produce sparse features, you need to write a feature
> function anyway.
>
> But if it's only a handful of scores and they can be calculated during
> extraction time, then go for dense features and add the scores directly
> to the phrase table.
>
> If the scores cannot be precalculated, for instance because you need
> non-local information that is only available during decoding, then a
> feature function implementation becomes necessary.
>
> When you write a feature function that calculates scores during decoding
> time, it can produce dense scores, sparse scores, or both types. That's
> up to you.
>
> If there are many scores that fire only rarely, then sparse is the
> right choice. And you certainly need a sparse feature function
> implementation in case you do not know in advance the overall
> number of feature scores it can produce.
>
> If you need information from phrase extraction in order to calculate
> scores during decoding time, then we have something denoted as "phrase
> properties". Phrase properties give you a means of storing arbitrary
> additional information in the phrase table. You have to extend the
> extraction pipeline to retrieve and store the phrase properties you
> require. The decoder can later read this information from the phrase
> table, and your feature function can utilize it in some way.
>
> A large number of sparse feature scores can somewhat slow down decoding
> and tuning. Also, you have to use MIRA or PRO for tuning, not MERT.
>
> Cheers,
> Matthias
>
>
> On Thu, 2015-07-16 at 02:18 +0100, jian zhang wrote:
> > Hi Matthias,
> >
> >
> > Not for domain feature.
> >
> >
> > I want to implement some sparse features, so there are two options:
> > 1. add them to the phrase table, if that is supported
> > 2. implement sparse feature functions
> >
> >
> > I'd like to know whether there is any difference between these two
> > options, for example in tuning or in computing sentence translation scores.
> >
> >
> > Regards,
> >
> >
> >
> > Jian
> >
> >
> >
> > On Thu, Jul 16, 2015 at 2:06 AM, Matthias Huck <mh...@inf.ed.ac.uk>
> > wrote:
> >         Hi,
> >
> >         Are you planning to use binary domain indicator features? I'm not sure
> >         whether a sparse feature function for this is currently implemented. If
> >         you're working with a small set of domains, you can employ dense
> >         indicators instead (domain-features = "indicator" in EMS). You'll have
> >         to re-extract the phrase table, though. Or process it with a script to
> >         add dense indicator values to the scores field.
> >
> >         I believe that there might also be some bug in the extraction pipeline
> >         when both domain-features = "sparse indicator" and score-settings =
> >         "--GoodTuring" are active in EMS. At least it caused me trouble a couple
> >         of weeks ago. However, I must admit that I didn't investigate it further
> >         at that point.
> >
> >         Anyway, the bottom line is that I recommend re-extracting with dense
> >         indicators.
> >
> >         But let me know what you find regarding a sparse implementation.
> >
> >         Cheers,
> >         Matthias
> >
> >
> >         On Thu, 2015-07-16 at 00:48 +0100, jian zhang wrote:
> >         > Hi,
> >         >
> >         >
> >         > Are sparse features in the phrase table, like
> >         >
> >         >
> >         >
> >         > das Haus ||| the house ||| 0.8 0.5 0.8 0.5 2.718 ||| 0-0 1-1 ||| 5000 5000 2500 ||| dom_europarl 1
> >         >
> >         >
> >         >
> >         > still supported? If yes, what should I set in the ini file based on
> >         > the example above?
> >         >
> >         >
> >         > Thanks,
> >         >
> >         >
> >         > Jian
> >         >
> >         >
> >         > --
> >         > Jian Zhang
> >         > Centre for Next Generation Localisation (CNGL)
> >         > Dublin City University
> >
> >         > _______________________________________________
> >         > Moses-support mailing list
> >         > Moses-support@mit.edu
> >         > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
> >
> >
> >         --
> >         The University of Edinburgh is a charitable body, registered
> >         in
> >         Scotland, with registration number SC005336.
> >
> >
> >
> >
> >
> > --
> > Jian Zhang
> > Centre for Next Generation Localisation (CNGL)
> > Dublin City University
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>

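The script-based alternative mentioned in the quoted thread (adding dense indicator values to the scores field instead of re-extracting) could look roughly like the sketch below. It rests on several assumptions that are not confirmed anywhere in this thread: the function name and the domain list are hypothetical, sparse domain features are assumed to be named "dom_" plus the domain name as in the example line, and the dense values 2.718 (present) and 1 (absent) are assumed so that they become 1 and 0 once the scores are log-transformed.

```python
import math


def add_dense_indicators(line, domains):
    """Append one dense indicator per domain to the scores field, drop the sparse field."""
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    # Assumption: the sixth field holds "name value" pairs such as "dom_europarl 1".
    sparse_tokens = fields[5].split() if len(fields) > 5 else []
    active = set(sparse_tokens[0::2])
    # Assumed encoding: e (printed as 2.718) if the domain fired, 1 otherwise.
    extra = [f"{math.e:.3f}" if "dom_" + d in active else "1" for d in domains]
    fields[2] = " ".join(fields[2].split() + extra)
    # Keep source, target, scores, alignment and counts only.
    return " ||| ".join(fields[:5])


line = ("das Haus ||| the house ||| 0.8 0.5 0.8 0.5 2.718 ||| 0-0 1-1 "
        "||| 5000 5000 2500 ||| dom_europarl 1")
converted = add_dense_indicators(line, ["europarl", "news"])
print(converted)
```

The number of scores declared for the phrase table in the ini file would presumably have to be increased to match the extra columns.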

-- 
Jian Zhang
Centre for Next Generation Localisation (CNGL)
<http://www.cngl.ie/index.html>
Dublin City University <http://www.dcu.ie/>
