Re: [Moses-support] the order of tuning and binarising phrase table

2017-12-22 Thread Manli Zhu
Thanks for confirming that.

I was following the baseline system walkthrough at
http://www.statmt.org/moses/?n=Moses.Baseline, whose step-by-step
instructions put tuning before binarising.


I have now binarised the phrase table and the reordering table, and I am
tuning with binarised-model/moses.ini plus the option
--no-filter-phrase-table. The tuning process is running smoothly.
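
For anyone following the same order, here is a minimal sketch of the sequence
(binarise, then tune). It only prints the commands rather than running them.
The function name and directory layout are hypothetical, the score count of 4
is an assumption based on a default baseline setup, and
binarised-model/moses.ini is assumed to have been edited by hand already so
that it points at the binarised tables.

```shell
# Print the commands for "binarise first, tune second" in order.
# Nothing is executed here; review the output before running it.
#   $1 = Moses checkout, $2 = working directory (assumed layout),
#   $3 = tuning source file, $4 = tuning reference file
binarise_then_tune_plan() {
  moses=$1 work=$2 src=$3 ref=$4
  cat <<EOF
$moses/bin/processPhraseTableMin -in $work/train/model/phrase-table.gz -nscores 4 -out $work/binarised-model/phrase-table
$moses/bin/processLexicalTableMin -in $work/train/model/reordering-table.wbe-msd-bidirectional-fe.gz -out $work/binarised-model/reordering-table
$moses/scripts/training/mert-moses.pl $src $ref $moses/bin/moses $work/binarised-model/moses.ini --mertdir $moses/bin/ --no-filter-phrase-table
EOF
}
```

Calling it with a Moses directory, a working directory and the two tuning
files prints three lines that can be inspected and then run one at a time.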





On Fri, Dec 22, 2017 at 7:19 PM, Hieu Hoang  wrote:

> Yes, it's usual to binarise before tuning. Where does it say otherwise?
>
> On 22 Dec 2017 6:57 pm, "Manli Zhu"  wrote:
>
>> Hi,
>>
>> After training, the baseline system walkthrough does tuning first and then
>> binarises the phrase table.
>>
>> Can we binarise the phrase table first and then tune, so that loading is
>> faster during tuning? I assume the phrase table and reordering table paths
>> in moses.ini need to be updated before tuning. Could you please confirm?
>>
>> Thank you
>>
>>
>> ___
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>


[Moses-support] 1st CFP: AMTA 2018 Workshop on Technologies for MT of Low Resource Languages (LoResMT 2018)

2017-12-22 Thread Chao-Hong
[Apologies for multiple postings]


  Workshop on Technologies for MT of Low Resource Languages (LoResMT 2018)
 Boston, Massachusetts, March 21, 2018
 https://sites.google.com/view/loresmt/
@ AMTA 2018 (http://www.conference.amtaweb.org/)


SCOPES

Statistical and neural machine translation (SMT/NMT) methods have been
successfully used to build MT systems for many popular languages over the
last two decades, with significant improvements in the quality of automatic
translation.  However, these methods still rely on a few natural language
processing (NLP) tools to pre-process human-generated text into the forms
required as input for these methods, and/or to post-process the output into
proper textual forms in the target languages.

In many MT systems, the performance of these tools has a great impact on the
quality of the resulting translation.  However, there has not been much
discussion of these NLP tools, their methods, their roles in MT systems built
with different methods, their coverage of the world's many languages, etc.
In this workshop, we would like to bring together researchers who work on
these topics and help review which tasks we most need these tools to perform
for MT in the coming years.

These NLP tools include, but are not limited to, several kinds of word
tokenizers/de-tokenizers, word segmenters, morphology analysers, etc.  In
this workshop, we solicit papers dedicated to these supplementary tools as
used in any language, and especially in low resource languages.  We would
like to have an overview of these NLP tools from our community.  Evaluations
of these tools in research papers should include how they have improved the
quality of MT output.

TOPICS

We solicit original research papers, review papers, and position papers on
these tools for the workshop.  Multilingual and/or cross-lingual NLP tools
for MT of low resource languages are especially welcome.  Topics of the
workshop include, but are not limited to:

-  Research and review papers of pre-process and/or post-process NLP tools
for MT
-  Position papers on the development of pre-process and/or post-process
tools for MT
-  Word tokenizers/de-tokenizers for specific languages
-  Word/morpheme segmenters for specific languages
-  Alignment/Re-ordering tools for specific language-pairs
-  Use of morphology analysers and/or morpheme segmenters for MT
-  Multilingual and/or Cross-lingual NLP tools for MT
-  Reusability of existing NLP tools for low resource languages
-  Corpora curation technologies for low resource languages
-  Review of available parallel corpora for low resource languages
-  Research and review papers of MT methods for low resource languages
-  Fast building of MT systems for low resource languages
-  Reusability of existing MT systems for low resource languages

IMPORTANT DATES

December 22, 2017: First call for papers
January 8, 2018: Second call for papers
February 4, 2018: Submission deadline of workshop papers
February 11, 2018: Notification of acceptance
February 16, 2018: Camera-ready papers due
March 21, 2018: LoResMT workshop

CONTACT

chaohong@adaptcentre.ie

*Chao-Hong Liu* | Marie Skłodowska-Curie fellow (MSCA RISE)
ADAPT Centre, School of Computing
Dublin City University
Dublin 9, Ireland
m: +353 (0) 89 247 3035
e: chaohong@adaptcentre.ie
www.adaptcentre.ie






Re: [Moses-support] the order of tuning and binarising phrase table

2017-12-22 Thread Hieu Hoang
Yes, it's usual to binarise before tuning. Where does it say otherwise?

On 22 Dec 2017 6:57 pm, "Manli Zhu"  wrote:

> Hi,
>
> After training, the baseline system walkthrough does tuning first and then
> binarises the phrase table.
>
> Can we binarise the phrase table first and then tune, so that loading is
> faster during tuning? I assume the phrase table and reordering table paths
> in moses.ini need to be updated before tuning. Could you please confirm?
>
> Thank you


[Moses-support] the order of tuning and binarising phrase table

2017-12-22 Thread Manli Zhu
Hi,

After training, the baseline system walkthrough does tuning first and then
binarises the phrase table.

Can we binarise the phrase table first and then tune, so that loading is
faster during tuning? I assume the phrase table and reordering table paths
in moses.ini need to be updated before tuning. Could you please confirm?

Thank you


Re: [Moses-support] Moses2 - binarizing phrase table

2017-12-22 Thread Yasmin Moslem
Hi Hieu!

Many thanks! I highly appreciate your helpful assistance.

Have a great weekend!

Kind regards,
Yasmin
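
A minimal sketch of the moses.ini edit described in the quoted messages below,
assuming the feature line starts with PhraseDictionaryMemory and carries a
path= field. The function name is hypothetical, and the LexicalReordering line
is deliberately left alone; see the Moses2 page for what that line needs.

```shell
# Read a moses.ini on stdin and write a copy on stdout in which the
# phrase-table feature uses ProbingPT and points at the binarised directory.
#   $1 = output directory produced by binarize4moses2.perl
# Assumes the directory name contains no '|' or '&' characters.
use_probing_pt() {
  probing_dir=$1
  sed -e 's|^PhraseDictionaryMemory |ProbingPT |' \
      -e "/^ProbingPT /s|path=[^ ]*|path=$probing_dir|"
}
```

Writing the result to a new file, rather than editing in place, keeps the
original moses.ini available for Moses(1).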

On 21 December 2017 at 13:48, Hieu Hoang  wrote:

> The Moses2 page contains details on what you need to change for
> lexicalised reordering:
>
>   http://www.statmt.org/moses/?n=Site.Moses2
>
> Here are some tips on getting training & tuning to run faster:
>
>   http://www.statmt.org/moses/?n=Moses.Optimize
>
> Moses2 can output n-best lists, so it can be used for tuning. However, I've
> never tried it, so there might be problems. If you don't want to deal with
> those problems, you're better off sticking to Moses(1).
>
> On 21/12/17 09:31, Yasmin Moslem wrote:
>
> Many thanks, Hieu!
>
> So that worked perfectly.
>
> In moses.ini, I changed PhraseDictionaryMemory to ProbingPT and changed
> its path to that of the output folder "integrated_phrase-reordering".
> When I run moses2 (~/mosesdecoder/bin/moses2 -f moses.ini), it works
> perfectly.
>
>
> >>> You may also need to change line 'LexicalReordering'
>
> Do you mean the path or something else? I kept the same
> "reordering-table.wbe-msd-bidirectional-fe.gz". I also tried
> "reordering-table.minlexr"
> (/home/ubuntu/working/binarised-model/reordering-table) and it worked too.
> Which one should I use?
>
>
>
> Another question, please, is moses2 for translation only? Can it be used
> for the Tuning phase to make it faster? Also, is there a way to make other
> steps faster? Currently, they take a few days to finish.
>
>
> Many thanks,
> Yasmin
>
>
>
>
>
>
> On 20 December 2017 at 14:33, Hieu Hoang  wrote:
>
>> In the moses.ini file, change the line
>>     [feature]
>>     PhraseDictionaryMemory.
>> to
>>     [feature]
>>     ProbingPT
>> You may also need to change line 'LexicalReordering'
>>
>> For example, attached is my moses.ini file for moses2
>>
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 19 December 2017 at 11:38, Yasmin Moslem 
>> wrote:
>>
>>> Hello!
>>>
>>> I tried to binarize the phrase table for Moses2 as follows:
>>>
>>> MOSES_DIR=~/mosesdecoder
>>>
>>>
>>> $MOSES_DIR/scripts/generic/binarize4moses2.perl \
>>>   --phrase-table=/home/ubuntu/working/train/model/phrase-table.gz \
>>>   --lex-ro=/home/ubuntu/working/train/model/reordering-table.wbe-msd-bidirectional-fe.gz \
>>>   --output-dir=/home/ubuntu/working/integrated_phrase-reordering/ \
>>>   --num-lex-scores=6
>>>
>>>
>>>
>>> However, I got the following error:
>>>
>>> >> ERROR: compile contrib/sigtest-filter
>>>
>>>
>>>
>>> So I followed these steps to compile it with SALM:
>>>
>>> git clone https://github.com/moses-smt/salm
>>>
>>>
>>> export CPLUS_INCLUDE_PATH=/home/ubuntu/mosesdecoder/opt/include
>>>
>>> export LIBRARY_PATH=/home/ubuntu/mosesdecoder/opt/lib
>>>
>>>
>>> cd ~/salm/Distribution/Linux
>>>
>>> make
>>>
>>>
>>> cd ~/mosesdecoder/contrib/sigtest-filter
>>>
>>> make SALMDIR=/home/ubuntu/salm
>>>
>>>
>>>
>>> Then, I ran the binarizing command above again. It worked fine and the
>>> folder "integrated_phrase-reordering" was created. (By the way, I had to
>>> restart the machine after compiling or I would get errors.)
>>>
>>>
>>>
>>> My question is: Now, what is the next step? Apparently, the change is
>>> not reflected in moses.ini
>>>
>>>
>>>
>>> Many thanks,
>>>
>>> Yasmin
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>
> --
> Hieu Hoang
> http://moses-smt.org/
>
>


Re: [Moses-support] writing a wrapper for the Moses decoder

2017-12-22 Thread Mathias Müller
Hi Ryan

(Did you see Tom’s message?)

Here is what I meant by writing a wrapper.

The Moses decoder can read from STDIN, so this will translate a sentence:

echo "this is a test" | moses -f moses.ini

Now suppose you have a script that properly segments a Thai input sentence:

echo "notproperlysegmentedThaisentence" | segment.sh
properly segmented Thai sentence

Then you can plug it in ahead of the decoder:

echo "notproperlysegmentedThaisentence" | segment.sh | moses -f moses.ini

For English-Thai, you would need a script that undoes the segmentation (that
is, restores the proper use of spacing; thanks, Tom, for explaining this) and
that also reads from STDIN and writes to STDOUT:

echo "this is a test" | moses -f moses.ini | unsegment.sh

For instance, if the task were just to remove spaces:

echo "this is a test" | moses -f moses.ini | sed 's/ //g'
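
Put together as a reusable function, the pipelines above might look like the
following sketch. The variable names PRE, DECODER and POST are made up for
this example, and segment.sh and unsegment.sh remain placeholders for
whatever scripts you write.

```shell
# Wrap the decoder between an optional pre-processing step and an optional
# post-processing step. Each stage is a command line taken from a variable:
#   PRE      e.g. "segment.sh" (placeholder name); default is cat, a no-op
#   DECODER  default is "moses -f moses.ini"
#   POST     e.g. "unsegment.sh" (placeholder name); default is cat, a no-op
# Each stage is run through `sh -c` so that quoted arguments survive.
translate() {
  sh -c "${PRE:-cat}" | sh -c "${DECODER:-moses -f moses.ini}" | sh -c "${POST:-cat}"
}
```

With PRE set to the segmenter this covers the Thai-English direction, and
with POST set to the unsegmenter it covers English-Thai. Setting DECODER to
cat is a quick way to check the pre- and post-processing without loading a
model.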

Regards!
Mathias

> On 22 Dec 2017, at 02:26, Ryan Coughlin  wrote:
> 
> Hi all,
> 
>   Mathias recommended to me that I should write a lite wrapper for the Moses 
> decoder. Is anyone aware of any documentation for doing such a thing? I'm not 
> able to find it with any kind of search.
> 
> thank you,
> Ryan