[Moses-support] build a demonstration system

2013-09-02 Thread 서원영


Hi,
I am a beginner with Moses and I am trying to build a demonstration system using mosesserver.cpp.
After testing the server program, I noticed that mosesserver.cpp has no tokenizing or truecasing step.
Has anybody else run into the same problem?
I am thinking about adding those steps to mosesserver.cpp.
I would like to hear what other people think of this idea.
 
Thanks and Best regards,
Wonyoung Seo, Engineer
Web Platform Lab / Software R&D Center / SAMSUNG ELECTRONICS CO., LTD
TEL/Mobile: +82-10-9626-3686 / E-mail: roarman@samsung.com
 




___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Check scores.size() == indexes.second - indexes.first failed

2013-09-02 Thread Herry Sujaini
I also use the method described in that link.
If you run into a problem, try asking on the mailing list; usually someone
will help within 24 hours.
http://mailman.mit.edu/mailman/listinfo/moses-support
 





 From: Hieu Hoang 
To: moses-support@mit.edu 
Sent: Saturday, 31 August 2013, 0:56
Subject: Re: [Moses-support] Check scores.size() == indexes.second - 
indexes.first failed
 

Can you send me your moses.ini file and the tuning stderr file?

ta

On 30/08/2013 18:50, Eleftherios Avramidis wrote:
> Hi all,
>
> the latest development version of the decoder (checked out yesterday)
> fails to run with 3 kenlm language models during tuning. I am not sure
> that this is the issue.
>
> Translating line 0  i
> Translating line 0  in thread id 140521597654784
> reading bin ttable
> size of OFF_T 8
> binary phrasefile loaded, default OFF_T: -1
> binary file loaded, default OFF_T: -1
> Check scores.size() == indexes.second - indexes.first failed in 
> ./moses/ScoreComponentCollection.h:235
> Aborted (core dumped)
> Exit code: 134
> The decoder died. CONFIG WAS -weight-overwrite 'LM0= 0.089286 LM1= 0.089286 
> WordPenalty0= -0.178571 PhrasePenalty0= 0.035714 LM2= 0.089286 
> TranslationModel0= 0.035714 0.035714 0.035714 0.035714 Distortion0= 0.053571 
> LexicalReordering0= 0.053571 0.053571 0.053571 0.053571 0.053571 0.053571'
> cp: cannot stat 
> `/share/taraxu/systems/bmmt/20130812_MAN/tuning/tmp.6/moses.ini': No such 
> file or directory
>
> best
> Lefteris
>
>
>



[Moses-support] EAMT summer internships 2014

2013-09-02 Thread Mikel Forcada


 EAMT summer internships 2014

The European Association for Machine Translation (EAMT, 
http://www.eamt.org) is an organization that serves the growing 
community of people interested in MT and translation tools, including 
users, developers, and researchers of this increasingly viable technology.
As part of its commitment to promote research, development and awareness 
about translation technologies, the EAMT is for the first time launching 
a call for summer internships.
Institutions or companies hosting the summer internships are expected to 
apply, manage the funding, select the student and document the internship.
Students wishing to visit a place may want to convince a host to apply 
for funding and help with proposal writing, but, in the end, the hosting 
institution or company will be responsible.
Summer internships are expected to occur between June 15, 2014 and 
September 15, 2014.



   Purpose of the Call

The EAMT is planning to support summer internships in the field of MT 
and translation tools.
The subject of the internship should be of direct interest to the 
community of researchers, developers, vendors and users of MT and 
translation technologies.

Topics of interest include, but are not limited to:

 * Recent developments in MT research.
 * MT evaluation methodology, metrics and results.
 * Launch of MT-specific evaluation campaigns.
 * New or prospective commercial uses of MT technology.
 * MT environments (workflow, support tools, etc.).
 * Interaction between users and MT systems.
 * MT combined with other technologies (translation memories, speech
   translation, cross-language information retrieval, multilingual text
   categorization, multilingual text summarization, etc.).
 * MT for less-resourced languages: development, usage, etc.
 * MT in the social Internet: new uses, new modes of development.

All proposals will be screened by a review committee that consists of 
EAMT Executive Committee members and possibly a few appointed external 
experts if necessary.



   Submission information


 Eligibility requirements

In order to qualify for funding, the hosting institution or the 
individual heading the hosting proposal must be a confirmed year 2014 
member of the EAMT at submission time.

Students must be confirmed members of the EAMT at application time.
Membership information: http://www.eamt.org/membership.php


 Selection criteria

 * The proposed activity should be of direct interest to the MT
   community at large: researchers, developers, vendors or users of MT
   technologies.
 * The proposal shall clearly describe the purpose of the internship,
   and include a detailed work plan
 * Proposals with a significant, clearly identified impact on the MT
   community (through the development, dissemination or use of project
   results) are most likely to be accepted.
 * Proposals that bring together different aspects of MT will be
   specially valued.
 * The proposal should be clearly justified as being technically and/or
   scientifically sound.
 * The quality and efficiency of the implementation of the proposal
   will be evaluated.
 * The budget should be adequate for the proposed objectives and the
   actual implementation of the internship.


 Budget

EAMT anticipates funding several summer internships.
Applications by hosting institutions or companies requesting partial 
funding to match their own internal funding will be given preference, 
but well-justified requests for full funding will be carefully considered.
The total foreseen EAMT budget for this call is around EUR10,000. The 
maximum amount EAMT can grant for a single internship will be EUR1,500 
per month, which should cover all expenses (see budget details below).
Hosting institutions or companies will receive 50% of the internship 
before the internship starts, and the remaining 50% during the 
internship, after a brief progress report by the person responsible at 
the hosting institution or company.



   Contact for enquiries

For general enquiries please contact:
Mikel L. Forcada
EAMT Secretary
e-mail: m...@ua.es


   Submission procedure


 Overview

Hosting institutions or companies should respond to the call by 
submitting a single PDF document, written in English, that is composed 
of the following elements:


1. Proposal summary: 1 page maximum
2. Experience of the host: 1 page maximum
3. Detailed description of work: 2 pages maximum
4. Budget: 2 pages maximum.

Submit your proposals as a single PDF file no later than the deadline 
(see Important Dates below) through EasyChair: 
http://www.easychair.org/conferences/?conf=eamt2013proposals

Mark it clearly as a proposal to host a summer internship.


 Detailed description of sections of the proposal by the hosting
 institution or company

1. Proposal summary (one page) in English.
 * Complete contact information of institution or company and
   mentoring person
 * Description of the activity to be performed by the 

[Moses-support] EAMT sponsorship of activities: call for proposals 2013

2013-09-02 Thread Mikel Forcada


 EAMT sponsorship of activities


 Call for proposals 2013

The European Association for Machine Translation (EAMT, 
http://www.eamt.org) is an organization that serves the growing 
community of people interested in MT and translation tools, including 
users, developers, and researchers of this increasingly viable technology.
As part of its commitment to promote research, development and awareness 
about translation technologies, the EAMT is for the fourth consecutive 
year launching a call for proposals to fund MT-related activities.



   Purpose of the Call

The EAMT is planning to support various MT activities such as tutorials, 
workshops, teaching and awareness initiatives, open-source initiatives, 
small research and development projects by its current members.
*The proposed activity should be of direct interest to the MT 
community*: researchers, developers, vendors or users of MT technologies.

The EAMT particularly encourages proposals from young members.
Topics of interest include, but are not limited to:

 * Recent developments in MT research.
 * MT evaluation methodology, metrics and results.
 * Launch of MT-specific evaluation campaigns.
 * New or prospective commercial users of MT technology.
 * MT environments (workflow, support tools, etc.).
 * Interaction between users and MT systems.
 * MT combined with other technologies (translation memories, speech
   translation, cross-language information retrieval, multilingual text
   categorization, multilingual text summarization, etc.).
 * MT for less-resourced languages: development, usage, etc.
 * MT in the social internet: new uses, new modes of development.

All proposals will be screened by a review committee that consists of 
EAMT Executive Committee members and possibly a few appointed external 
experts if necessary.



   Submission information


 Eligibility requirements

In order to qualify for funding, the institution(s) or the individual 
making the proposal must be a confirmed member of the EAMT at submission 
time.

Membership information: http://www.eamt.org/membership.php


 Selection criteria

 * The proposed activity should be of direct interest to the MT
   community at large: researchers, developers, vendors or users of MT
   technologies.
 * The proposal shall clearly describe the purpose of the project and
   include measurable mid-project milestones for which a report should
   be submitted (see below).
 * Preference will be given to projects which by nature will involve
   and be beneficial for several persons, as for instance conferences,
   seminars, workshops and tutorials.
 * Proposals with a significant, clearly identified impact on the MT
   community (through the development, dissemination or use of project
   results) are most likely to be accepted.
 * Proposals that bring together different aspects of MT will be
   specially valued.
 * The proposal should be clearly justified as being technically and/or
   scientifically sound.
 * The quality and efficiency of the implementation of the proposal
   will be evaluated.
 * The budget should be adequate for the proposed objectives and the
   actual implementation of the activity.


 Budget

EAMT anticipates funding several proposals for various activities. There 
are two categories of proposals. The member institutions' category and 
the individual members' category.
The total foreseen EAMT Budget for this call is around EUR20,000 to 
cover all granted projects (tutorials, workshops, teaching and awareness 
initiatives, open-source initiatives, and research and development 
projects). The maximum amount EAMT can grant for a single project will 
be EUR10,000.
A project being granted financial support by EAMT according to this call 
will receive 50% of the granted amount at the start of the project. The 
proposer will receive the remaining 50% when the mid-project progress 
report has been received by the EAMT Secretary and substantiates that 
the mid-project milestones are met, and furthermore provided that the 
proposer is still a current member of the EAMT.



 Contact for enquiries

For general enquiries please contact:
Mikel L. Forcada
EAMT Secretary
e-mail: m...@ua.es


   Submission procedure


 Overview

Candidates should respond to the call by submitting a single PDF 
document, written in English, that is composed of the following elements:


1. Proposal summary: 1-page maximum
2. Detailed proposal description: 5-page maximum
3. Budget and project planning overview: 1-page maximum

Submit your proposals as a single PDF file no later than the deadline 
(see Important Dates below) through EasyChair: 
http://www.easychair.org/conferences/?conf=eamt2013proposals



 Detailed description of sections of the proposal

1. Proposal summary (one page) in English.
 * Complete contact information of candidate
 * Description of the activity or the event.
 * Budget summary and identification of the support requested from
   the EA

Re: [Moses-support] Tuning and decoding of lattices in the new Moses.

2013-09-02 Thread Hieu Hoang
Hi Yulia


On 1 September 2013 22:46, Yulia Tsvetkov  wrote:

> Dear Moses developers,
>
> I am trying to use the new version of Moses. It seems things have
> changed quite a bit, and I am having a hard time finding up-to-date
> documentation. For debugging I used very small train/tune/test corpora (10
> lines each).
>
> First thing is running the following command produces a phrase table with
> only 4 features:
> train-model.perl --root-dir $root_dir --corpus $root_dir/$corpus_name  --f
> $src_lng --e $trg_lng --alignment grow-diag-final --lm 0:3:$LM
> -external-bin-dir $external_bin_dir`;
>
> Here is a snippet from the produced moses.ini:
> PhraseDictionaryMemory name=TranslationModel0 table-limit=20 *num-features=4*
> path=/usr1/projects/mt_proj/mt_eval/baselines/fr-base-1-lats/model/phrase-table.gz
> input-factor=0 output-factor=0
>

Yes, the phrase-table now has 4 scores instead of 5. The 5th score was a
constant 2.718. This has now moved into its own feature function,
PhrasePenalty.

It saves 3% of disk space, and I think it is better for research: e.g. we can
create better, non-constant phrase penalty feature functions, or ask whether
2 phrase tables need just 1 phrase penalty.


> Second, I am trying to run tuning and decoding of lattices in plf format.
> Can you point me to example commands and moses.ini for running mert and
> decoding lattices with the new Moses?
>
An example ini file for lattices can be seen here:

https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.lattice-surface/moses.ini

Mert should run as it always has. However, if you upgrade the
decoder, you should use the upgraded mert script too.

Decoding with a lattice is exactly the same as for a sentence, except for 2 things:
   1. inputtype=2. This can be set on the command line or in the ini file, e.g.
   ./moses -inputtype 2

   or
[inputtype]
2

   2. You should use the InputFeature feature function. This is the score
of the path through the lattice. You can see the InputFeature in the ini
file:
  [feature]
  
  InputFeature num-features=1 num-input-features=1 real-word-count=0

  [weight]
  ...
  InputFeature0 = 1

   Before the refactoring, this was hacked in as an extra feature in the
phrase-table.
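
For anyone preparing the lattice input itself, here is a minimal Python sketch of writing one lattice in PLF. It assumes the layout described in the Moses word-lattice documentation: one Python-style nested tuple per line, where each node is a tuple of edges and each edge is (word, score, distance to the next node). The function name to_plf and the sample lattice are hypothetical, not part of Moses.

```python
def _escape(word):
    # Escape backslashes and single quotes so the word stays a valid quoted string.
    return word.replace("\\", "\\\\").replace("'", "\\'")


def to_plf(nodes):
    """Serialise a lattice to a single PLF line.

    nodes: list of nodes; each node is a list of (word, score, jump) edges,
    where jump is the distance to the node the edge leads to.
    """
    out = []
    for edges in nodes:
        cells = "".join(
            f"('{_escape(word)}', {score}, {jump})," for word, score, jump in edges
        )
        out.append(f"({cells}),")
    return "(" + "".join(out) + ")"


if __name__ == "__main__":
    # Hypothetical two-node lattice: one fixed word, then two alternatives.
    lattice = [
        [("a", 1.0, 1)],
        [("b", 0.5, 1), ("c", 0.5, 1)],
    ]
    print(to_plf(lattice))
```

Each sentence goes on its own line of the input file, which is then passed to the decoder with -inputtype 2 as described above.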

>
> So far I tried training and tuning on text files and decoding on lattices
> because I could not figure out the right settings for tuning.
> According to some old documentation I am supposed to convert the phrase
> table to a binary format. Is it still needed?
>
You no longer need to convert it to binary format. It's good to convert to
binary format to save memory, but it is not required. Lattice decoding
works with all phrase-table implementations now.

>
> When I ran it with the following command:
> moses *-inputtype 2 -weight-i 0.62 -weight-l 12.5* -f $tune_dir/moses.ini
> < $eval_dir/69.plf > $eval_dir/69.plf.out
> I got an error:
> *Don't mix old and new ini file format*
> What is the new equivalent of weight-i and weight-l?
>

   -weight-i 0.62
now becomes
   -weight-overwrite 'InputFeature0= 0.62'

  -weight-l 12.5
now becomes
   -weight-overwrite 'LM0= 12.5'

The updated mert script should be doing this anyway.
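
If you call the decoder from your own scripts, the rewrite above can be sketched in Python as follows. This is only an illustration: the mapping covers just the two flags discussed here, the helper name to_weight_overwrite is hypothetical, and joining several weights in one quoted string follows the form the tuning script prints in its "CONFIG WAS -weight-overwrite '...'" message.

```python
# Old flag -> new feature name, taken only from the two examples above.
# Other old-style weight flags are deliberately not covered.
OLD_TO_NEW = {
    "-weight-i": "InputFeature0",
    "-weight-l": "LM0",
}


def to_weight_overwrite(old_args):
    """Convert a flat list like ['-weight-i', '0.62'] into
    ['-weight-overwrite', 'InputFeature0= 0.62']."""
    parts = []
    # Walk the list as (flag, value) pairs.
    for flag, value in zip(old_args[0::2], old_args[1::2]):
        parts.append(f"{OLD_TO_NEW[flag]}= {value}")
    return ["-weight-overwrite", " ".join(parts)]


if __name__ == "__main__":
    print(to_weight_overwrite(["-weight-i", "0.62", "-weight-l", "12.5"]))
```

The returned list can be appended to the argument list given to subprocess, so no shell quoting is needed.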

>
> Without those parameters I get a Segmentation Fault with both a .gz and a
> binary phrase table.
>

If you're still having problems, send me your ini file and the exact command
you're executing and I'll try to debug it.

>
> Could you help me figuring out the right settings?
>
> Thanks in advance.
>


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu


Re: [Moses-support] [SPAM?]: build a demonstration system

2013-09-02 Thread Christian Buck
Hi Wonyoung,

There is a thin wrapper around mosesserver which you can use to provide 
a service similar to the Google Translate API:

https://github.com/christianbuck/matecat_util/tree/master/python_server

I've put up a short README that covers basic usage.
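
To illustrate the general idea of preprocessing outside mosesserver (this is not the code from the repository above), here is a rough Python sketch. The regex tokenizer and the lowercasing of the first token are crude stand-ins for the Moses tokenizer and a trained truecaser; the function names and the server URL are assumptions, and the sketch assumes mosesserver's XML-RPC translate method takes and returns a struct with a "text" field.

```python
import re
import xmlrpc.client

# Assumed address of a locally running mosesserver; adjust to your setup.
SERVER_URL = "http://localhost:8080/RPC2"


def preprocess(text):
    """Naive stand-in for tokenizing and truecasing.

    Splits punctuation from words and lowercases the first token.
    A real system would call the Moses tokenizer and a trained truecaser.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if tokens:
        tokens[0] = tokens[0].lower()
    return " ".join(tokens)


def translate(text, url=SERVER_URL):
    """Preprocess the text, send it to mosesserver, return the raw output."""
    proxy = xmlrpc.client.ServerProxy(url)
    response = proxy.translate({"text": preprocess(text)})
    return response["text"]


if __name__ == "__main__":
    # Only the preprocessing step runs here; translate() needs a live server.
    print(preprocess("Hello, world!"))
```

The output of translate() would still need detruecasing and detokenizing before being shown to a user.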

cheers,
Christian

On 02/09/13 07:59, 서원영 wrote:
> Hi,
>
> I am just a beginner in Moses and I am trying to build a demonstration
> system by using mosesserver.cpp.
>
> After I tested server program, I've noticed that there are no tokenizing
> and truecasing step in mosesserver.cpp.
>
> So, Is there anybody who met a same problem with me?
>
> I am thinking about inserting those steps in mosesserver.cpp.
>
> I wish I could hear about this idea from other people.
>
> Thanks and Best regards,
>
> *Engineer
> Wonyoung Seo
> *
>
> Web Platform Lab / Software R&D Center / *SAMSUNG ELECTRONICS CO,. LTD*
> *TEL/Mobile* : +82-10-9626-3686 / *E-mail* : roarman@samsung.com
>
>
>


Re: [Moses-support] build a demonstration system

2013-09-02 Thread 서원영


Hi Christian,
 
I really appreciate your advice. It will be a great help to me.
Thanks again,
 
Won-Young
 
--- Original Message ---
Sender : Christian Buck
Date : 2013-09-03 02:36 (GMT+09:00)
Title : Re: [SPAM?]: [Moses-support] build a demonstration system

Hi Wonyoung,

There is a thin wrapper around mosesserver which you can use to provide
a service similar to the Google Translate API:

https://github.com/christianbuck/matecat_util/tree/master/python_server

I've put up a short README that covers basic usage.

cheers,
Christian

On 02/09/13 07:59, 서원영 wrote:
> Hi,
>
> I am just a beginner in Moses and I am trying to build a demonstration
> system by using mosesserver.cpp.
>
> After I tested server program, I've noticed that there are no tokenizing
> and truecasing step in mosesserver.cpp.
>
> So, Is there anybody who met a same problem with me?
>
> I am thinking about inserting those steps in mosesserver.cpp.
>
> I wish I could hear about this idea from other people.
>
> Thanks and Best regards,
>
> *Engineer
> Wonyoung Seo
> *
>
> Web Platform Lab / Software R&D Center / *SAMSUNG ELECTRONICS CO,. LTD*
> *TEL/Mobile* : +82-10-9626-3686 / *E-mail* : roarman@samsung.com

Wonyoung Seo, Engineer
Web Platform Lab / Software R&D Center / SAMSUNG ELECTRONICS CO., LTD
TEL/Mobile: +82-10-9626-3686 / E-mail: roarman@samsung.com
 



