[Moses-support] EAMT 2014: Call for Papers, deadline March 28

2014-03-10 Thread Philipp Koehn
EAMT 2014 Call for Papers

Annual Conference of the European Association for Machine Translation
(EAMT2014)
Dubrovnik, Croatia
16-18 June 2014
http://hnk.ffzg.hr/eamt2014/home.html

The European Association for Machine Translation (EAMT) invites
everyone interested in machine translation and translation-related
tools and resources to participate in this conference -- developers,
researchers, users (including professional translators and
translation/localisation managers): anyone who has a stake in the
vision of an information world in which language issues become less
visible to the information consumer. We especially invite researchers
to describe the state of the art and demonstrate their cutting-edge
results and avid MT users to share their experiences.

We expect to receive manuscripts in these three categories:

===

(R) Research papers: Long-paper submissions (8 pages) are invited for
reports of significant research results in any aspect of machine
translation and related areas. Such reports should include a
substantial evaluation component. Contributions are welcome on all
topics in the area of Machine Translation or translation-related
technologies, including:

MT methodologies and techniques
Speech translation: speech to text, speech to speech
Translation aids (translation memory, terminology databases, etc.)
Translation environments (workflow, support tools, conversion
tools for lexica, etc.)
Practical MT systems (MT for professionals, MT for multilingual
eCommerce, MT for localization, etc.)
MT in multilingual public service (eGovernment etc.)
MT for the web
MT embedded in other services
MT evaluation techniques and evaluation results
Dictionaries and lexica for MT
Text and speech corpora for MT
Standards in text and lexicon encoding for MT
Human factors in MT and user interfaces
Related multilingual technologies (natural language generation,
information retrieval, text categorization, text summarization,
information extraction, etc.)

Papers should describe original work. They should emphasize completed
work rather than intended work, and should indicate clearly the state
of completion of the reported results. Where appropriate, concrete
evaluation results should be included.

===

(U) User studies: Short-paper submissions (2-4 pages) are invited for
reports on users' experiences with MT, be it in small or medium size
business (SMB), enterprise, government, or NGOs. Contributions are
welcome on:

Integrating MT and computer-assisted translation into a
translation production workflow (e.g. transforming terminology
glossaries into MT resources, optimizing TM/MT thresholds, mixing
online and offline tools, using interactive MT, dealing with MT
confidence scores)
Use of MT to improve translation or localization workflows (e.g.
reducing turnaround times, improving translation consistency,
increasing the scope of globalization projects)
Managing change when implementing and using MT (e.g. switching
between multiple MT systems, limiting degradations when updating or
upgrading an MT system)
Implementing open-source MT in the SMB or enterprise (e.g.
strategies to get support, reports on taking pilot results into full
deployment, examples of advance customisation sought and obtained
thanks to the open-source paradigm)
Evaluation of MT in a real-world setting (e.g. error detection
strategies employed, metrics used, productivity or translation quality
gains achieved)
Post-editing strategies and tools (e.g. limitations of traditional
translation quality assurance tools, challenges associated with
post-editing guidelines)
Legal issues associated with MT, especially MT in the cloud (e.g.
copyright, privacy)
Use of MT in social networking or real-time communication (e.g.
enterprise support chat)
Use of MT to process multilingual content for assimilation
purposes (e.g. cross-lingual information retrieval, MT for
e-discovery, MT for spam detection)
Use of standards for MT

Papers should highlight problems and solutions and not merely describe
MT integration process or project settings. Where solutions do not
seem to exist, suggestions for MT researchers and developers should be
clearly emphasized. For user papers produced by academics, we require
co-authorship with the actual users.

(P) Project/Product description: Abstract submissions (1 page) are
invited to report new, interesting:

Tools for machine translation, computer aided translation, and the
like (including commercial products and open source software). The
authors should be ready to present the tools in the form of demos or
posters during the conference.
Research projects related to machine translation. The authors
should be ready to present the projects in the form of posters during
the conference. This fol

[Moses-support] Problem in web based SMT System using MOSES

2014-03-10 Thread विशाल गोयल
Hello All,
Greetings.
We have developed an SMT-based Hindi-to-Punjabi MT system using
Moses. We have now hosted it on a Linux server, along with the required
toolkits, including Moses.

Our problem is this: when a small text is given as input, the system
translates it, but when it receives a number of translation requests at
once, it gets busy and starts producing only transliteration rather than
translation.
It has been developed using CGI-Perl and has been installed at
http://tdil-dc.in/hi2pu

Please help us in solving our problem.

Thanks in anticipation...
Ajit- Please remain in contact with replies from the researchers of this
group...

-- 
*Regards,*
Vishal Goyal,
Ph.D., M.Tech., MCA, M.C.S.D.
Assistant Professor(Stage III) and Placement Coordinator,
Department of Computer Science,
Punjabi University Patiala-147002
[*Online Hindi to Punjabi Machine Translation Tool -*
http://h2p.learnpunjabi.org ]
*[Research Cell: An International Journal of Engineering Sciences,
http://ijoes.vidyapublications.com ]*
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Reg : Segmentation Fault while loading

2014-03-10 Thread Rajen Chatterjee
Hi Kunal,
I would suggest that you not edit the phrase table; instead, clean your
training corpus and retrain the system. Editing phrases may lead to
inconsistent probability scores and alignments in the phrase table.
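
A minimal sketch of the corpus-cleaning route, in Python. It assumes a
whitespace-tokenised parallel corpus held as two lists of lines, and that the
unwanted tokens are stray quote characters; the function names and the token
set are illustrative and are not part of Moses. Pairs that become empty on
either side are dropped together, so the two sides stay line-aligned before
retraining.

```python
# Tokens to strip from both sides of the corpus (an assumption for this sketch).
UNWANTED = frozenset({"'", '"'})


def clean_line(line, unwanted=UNWANTED):
    """Return the line with every unwanted token removed."""
    return " ".join(tok for tok in line.split() if tok not in unwanted)


def clean_parallel(src_lines, tgt_lines, unwanted=UNWANTED):
    """Clean both sides and keep only pairs that are non-empty on both sides."""
    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src_clean = clean_line(src, unwanted)
        tgt_clean = clean_line(tgt, unwanted)
        if src_clean and tgt_clean:
            pairs.append((src_clean, tgt_clean))
    return pairs
```

The cleaned pairs would then be written back to the two corpus files and the
usual training pipeline rerun, so that scores and alignments are recomputed
from consistent data.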


On Mon, Mar 10, 2014 at 4:05 PM, Marcin Junczys-Dowmunt
wrote:

>  Hi Kunal,
> I am not sure this is causing the segfault, but take a look at the
> alignment. You are removing tokens from the source phrase, before 5, now 3,
> but you leave the alignment untouched which still maps 5 to 5 tokens. You
> would either need to shift and remove the alignment points or get rid of
> the alignment altogether.
> Best,
> Marcin
>
> W dniu 10.03.2014 16:01, Kunal Sachdeva pisze:
>
> Hello all,
> I tried to clean my phrase table by removing some characters from
> phrases, like
>
>  '  ' ' ' comments Remain Rolland ||| - - रोम्यां रोलां ने ||| 0.112086
> 2.80203e-05 0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2 1-2 2-2 4-2
> 3-3 4-3 4-4 ||| 1 3 1
>
>  changed to >
>
> comments Remain Rolland ||| - - रोम्यां रोलां ने ||| 0.112086 2.80203e-05
> 0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2 1-2 2-2 4-2 3-3 4-3 4-4
> ||| 1 3 1
>
>  Now, when I try to decode a sentence, I get an error:
>
> Reading /home/seecat/WMT/eng-hin/WMT_new/model/temp1.gz
>
> 5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ***Segmentation fault (core dumped)
>
>  This error does not always occur when I run it with the -v (verbose)
> option.
>
>  It would be great if someone could help me.
>
>  Regards,
> Kunal Sachdeva
>
>


-- 
-Regards,
 Rajen Chatterjee.


Re: [Moses-support] Reg : Segmentation Fault while loading

2014-03-10 Thread Marcin Junczys-Dowmunt

Hi Kunal,
I am not sure this is causing the segfault, but take a look at the 
alignment. You are removing tokens from the source phrase, before 5, now 
3, but you leave the alignment untouched which still maps 5 to 5 tokens. 
You would either need to shift and remove the alignment points or get 
rid of the alignment altogether.
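
One way to do the shifting, sketched in Python. It assumes the alignment field
holds source-target index pairs, and that the two removed tokens are the first
two of a five-token source phrase; drop_source_tokens is a hypothetical helper,
not a Moses tool, and the source tokens below are placeholders. It leaves the
scores and counts untouched, so those would still reflect the unedited phrase.

```python
def drop_source_tokens(src_tokens, alignment, drop):
    """Remove source tokens at the indices in `drop` and re-index the alignment.

    `alignment` is a string of "source-target" index pairs separated by spaces.
    Points attached to a removed source token are discarded; the remaining
    source indices are shifted down to match the shortened phrase.
    """
    drop = set(drop)
    new_index = {}
    for old in range(len(src_tokens)):
        if old not in drop:
            new_index[old] = len(new_index)  # next free position
    kept = [tok for i, tok in enumerate(src_tokens) if i not in drop]
    points = []
    for point in alignment.split():
        src, tgt = point.split("-")
        if int(src) in new_index:
            points.append(f"{new_index[int(src)]}-{tgt}")
    return kept, " ".join(points)


# Placeholder five-token source phrase; the first two tokens are removed.
source = ["'", "'", "comments", "Remain", "Rolland"]
align = "0-1 0-2 1-2 2-2 4-2 3-3 4-3 4-4"
tokens, fixed = drop_source_tokens(source, align, drop=[0, 1])
print(" ".join(tokens))  # comments Remain Rolland
print(fixed)             # 0-2 2-2 1-3 2-3 2-4
```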

Best,
Marcin

W dniu 10.03.2014 16:01, Kunal Sachdeva pisze:

Hello all,
I tried to clean my phrase table by removing some characters from
phrases, like


 '  ' ' ' comments Remain Rolland ||| - - रोम्यां रोलां ने |||
0.112086 2.80203e-05 0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2
1-2 2-2 4-2 3-3 4-3 4-4 ||| 1 3 1


changed to >

comments Remain Rolland ||| - - रोम्यां रोलां ने ||| 0.112086
2.80203e-05 0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2 1-2 2-2
4-2 3-3 4-3 4-4 ||| 1 3 1


Now, when I try to decode a sentence, I get an error:

Reading /home/seecat/WMT/eng-hin/WMT_new/model/temp1.gz
5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
***Segmentation fault (core dumped)

This error does not always occur when I run it with the -v (verbose)
option.


It would be great if someone could help me.

Regards,
Kunal Sachdeva




[Moses-support] Reg : Segmentation Fault while loading

2014-03-10 Thread Kunal Sachdeva
Hello all,
I tried to clean my phrase table by removing some characters from phrases,
like

 '  ' ' ' comments Remain Rolland ||| - - रोम्यां रोलां ने ||| 0.112086
2.80203e-05 0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2 1-2 2-2 4-2
3-3 4-3 4-4 ||| 1 3 1

changed to >

comments Remain Rolland ||| - - रोम्यां रोलां ने ||| 0.112086 2.80203e-05
0.0373621 8.65407e-07 2.718 0.030868 ||| 0-1 0-2 1-2 2-2 4-2 3-3 4-3 4-4
||| 1 3 1

Now, when I try to decode a sentence, I get an error:

Reading /home/seecat/WMT/eng-hin/WMT_new/model/temp1.gz
5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
***Segmentation fault (core dumped)

This error does not always occur when I run it with the -v (verbose)
option.

It would be great if someone could help me.

Regards,
Kunal Sachdeva


Re: [Moses-support] Help. First request to MosesServer very slow

2014-03-10 Thread Marcin Junczys-Dowmunt
Hi Marcos,
the command-line behaviour you describe is actually consistent with
what Barry said. You see this lag before the first couple of
translations, and later it disappears. Usually you do not use the
command-line tool for only 4 sentences, but for a few hundred or a few
thousand. For instance, if you created a file that contained the same
sentences, say, 10 times, you would see the lag only for the first
occurrence. Your problem is that MosesServer basically destroys the
cache after each request, so it has to fill it again for each new
request. Hence the delay.
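
A toy Python model of this effect, not Moses code: each thread keeps its own
cache (here via threading.local), so a worker thread that serves many
sentences pays for the slow lookup once, while a fresh thread per request pays
for it every time. The class name, the upper-casing stand-in for a phrase-table
lookup, and the miss counter are all invented for illustration.

```python
import threading


class PerThreadCache:
    """Toy stand-in for a per-thread phrase cache."""

    def __init__(self):
        self._local = threading.local()  # separate storage for every thread
        self._lock = threading.Lock()    # protects the shared miss counter
        self.misses = 0

    def lookup(self, phrase):
        cache = getattr(self._local, "cache", None)
        if cache is None:                # first lookup on this thread
            cache = self._local.cache = {}
        if phrase not in cache:
            with self._lock:
                self.misses += 1         # stands for a slow phrase-table access
            cache[phrase] = phrase.upper()
        return cache[phrase]


def count_misses(requests, new_thread_per_request):
    """Serve the requests and return how many slow lookups were needed."""
    cache = PerThreadCache()
    if new_thread_per_request:
        for phrase in requests:
            worker = threading.Thread(target=cache.lookup, args=(phrase,))
            worker.start()
            worker.join()
    else:
        worker = threading.Thread(
            target=lambda: [cache.lookup(p) for p in requests])
        worker.start()
        worker.join()
    return cache.misses


requests = ["la casa"] * 3
print(count_misses(requests, new_thread_per_request=False))  # 1
print(count_misses(requests, new_thread_per_request=True))   # 3
```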
Best,
Marcin

W dniu 10.03.2014 09:22, Marcos Fernandez pisze:
>> Hi Marcos
>>
>> I think the problem is that the rules (or phrase pairs) are now cached
>> on a per thread basis. This is good for command-line Moses as it uses a
>> pool of threads, and having per-thread caches means that there is no
>> locking on the caches, as there used to be.
> Barry, I am not sure that is the cause, because in that case my "20
> short sentences" file would get translated much faster with command-line
> Moses; however, it takes 4.5 seconds to translate. That is 2 seconds
> (measured) for loading the tables into memory, plus the same 2.5 seconds
> as with MosesServer to translate the file. The behaviour is also the same
> as before: the first sentence in the file takes much longer than the rest.
>
> However, what you say could perhaps explain the difference in time
> between using a different xmlrpc ServerProxy object for each request
> (probably, in this case, xmlrpc executes each request in a different thread)
> and reusing a single ServerProxy for all the requests (where there would be
> only one thread, so it could take advantage of the cache).
>
> What I understand then, is that the cache stores information of the
> previously translated sentences to accelerate the translation of the next
> ones. But that does not eliminate the problem of the first slower request.
> As you can see, I am finding that issue even in command-line Moses (not with
> the first "request", but with the first sentence in a file).
>
> I am thinking that perhaps I have no problem, and this is just the usual way
> in which Moses works. Just to make sure:
>
> 1. would you say that a time of 2-3 seconds for translating (spa-eng) a
> single sentence (~15-20 words) could be a normal response time (discounting
> the time of loading tables)? (Intel Xeon with 32GB RAM)
> 2. if now you write 3 similar sentences in a file, and execute Moses
> (command-line, serial) over this file, would you expect it to take a much
> shorter time (perhaps the half) than the sum of the times for the 3 single
> sentences?
>
> If the answer is yes to both (especially the second), then I am probably
> worrying in vain. My worries started when I read the "MTMonkey" paper:
> http://ufal.mff.cuni.cz/~pecina/files/pbml-2013.pdf
>
> Here the authors use the approach of creating a new ServerProxy instance
> each time a request is sent to MosesServer (the worst case scenario for me),
> and they get great results, so I thought they were not experiencing that
> overhead for every request. But perhaps they just used sentences that get
> translated very fast even with that overhead.
>
> Well, in the case that this is the usual way of working for Moses, if what
> Kenneth suggests is possible, that would eliminate the overhead almost
> completely for MosesServer. I mean, there would still be an overhead for the
> first request that each thread serves after the server is just launched, but
> never more, because the caches would be filled with useful information from
> that point on. I think that this would be extremely interesting for web
> translating services, or to do web-page translations "on the fly".
>
> However, I don't see a way to avoid that overhead in command-line Moses, as
> it "dies" after each execution.
>
> Marcos.
>
>
>> mosesserver, afaik, creates a new thread for each connection, so it
>> can't take advantage of the cache. This is done in the xmlrpc-c library
>> so we don't have much control over it. If you dig around in the xmlrpc-c
>> documentation (or code!) you might find a way to control the threading
>> policy.
>>
>> I just spoke to Marcin about the problem, and we're not sure if loading
>> the compact phrase table into memory would help, as you still would need
>> the higher level cache (in PhraseDictionary). But you could try this anyway.
>>
>> cheers - Barry
>>
>> On 06/03/14 17:20, Marcos Fernandez wrote:
>>> Hi, I am having an issue with MosesServer.
>>>
>>> I am using compact phrase and reordering table, and KENLM.
>>>
>>> The problem is this (I'll explain with an example):
>>>
>>> - I have one file with 20 very short sentences. I split and tokenize
>>> them and send one XMLRPC request per sentence to MosesServer
>>> - If I create just one XMLRPC ServerProxy instance and I use it to send
>>> all the requests through it, all the sentences get translated in approx
>>

[Moses-support] Problem faced in SMT Approach MT SYSTEM using MOSES

2014-03-10 Thread विशाल गोयल
> Hello All,
> Greetings. We have developed an SMT-based Hindi-to-Punjabi MT system
> using Moses. We have now hosted it on a Linux server, along with the
> required toolkits, including Moses.
>
> Our problem is this: when a small text is given as input, the system
> translates it, but when it receives a number of translation requests at
> once, it gets busy and starts producing only transliteration rather than
> translation.
> It has been developed using CGI-Perl and has been installed at
> http://tdil-dc.in/hi2pu
>
> Please help us in solving our problem.
>
> Thanks in anticipation...
> Ajit- Please remain in contact with replies from the researchers of this
> group...
>
>
>
>
-- 
*Regards,*
Vishal Goyal,
Ph.D., M.Tech., MCA, M.C.S.D.
Assistant Professor(Stage III) and Placement Coordinator,
Department of Computer Science,
Punjabi University Patiala-147002
[*Online Hindi to Punjabi Machine Translation Tool -*
http://h2p.learnpunjabi.org ]
*[Research Cell: An International Journal of Engineering Sciences,
http://ijoes.vidyapublications.com ]*


Re: [Moses-support] Help. First request to MosesServer very slow

2014-03-10 Thread Marcos Fernandez
> Hi Marcos
> 
> I think the problem is that the rules (or phrase pairs) are now cached 
> on a per thread basis. This is good for command-line Moses as it uses a 
> pool of threads, and having per-thread caches means that there is no 
> locking on the caches, as there used to be.

Barry, I am not sure that is the cause, because in that case my "20
short sentences" file would get translated much faster with command-line
Moses; however, it takes 4.5 seconds to translate. That is 2 seconds
(measured) for loading the tables into memory, plus the same 2.5 seconds
as with MosesServer to translate the file. The behaviour is also the same
as before: the first sentence in the file takes much longer than the rest.

However, what you say could perhaps explain the difference in time
between using a different xmlrpc ServerProxy object for each request
(probably, in this case, xmlrpc executes each request in a different thread)
and reusing a single ServerProxy for all the requests (where there would be
only one thread, so it could take advantage of the cache).
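
The two client patterns compared here can be sketched in Python as follows.
This assumes MosesServer exposes an XML-RPC method named translate that takes
a struct with a "text" field and returns one with a "text" field; the URL in
the usage comment is a placeholder, and make_proxy is a hypothetical hook
added only so the function can be tried without a running server.

```python
import time
import xmlrpc.client


def translate_all(sentences, url, reuse_proxy=True,
                  make_proxy=xmlrpc.client.ServerProxy):
    """Send one request per sentence; return (translations, elapsed seconds).

    With reuse_proxy=True a single proxy object serves every request;
    with reuse_proxy=False a new proxy is created for each sentence.
    """
    shared = make_proxy(url) if reuse_proxy else None
    translations = []
    start = time.perf_counter()
    for sentence in sentences:
        proxy = shared if reuse_proxy else make_proxy(url)
        result = proxy.translate({"text": sentence})
        translations.append(result["text"])
    return translations, time.perf_counter() - start


# Usage against a running server (placeholder address, not from this thread):
#   out, secs = translate_all(lines, "http://localhost:8080/RPC2")
#   out, secs = translate_all(lines, "http://localhost:8080/RPC2",
#                             reuse_proxy=False)
```

Timing both calls on the same input should make the per-request overhead
visible, if it is caused by the proxy pattern rather than by the sentences
themselves.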

What I understand then, is that the cache stores information of the
previously translated sentences to accelerate the translation of the next
ones. But that does not eliminate the problem of the first slower request.
As you can see, I am finding that issue even in command-line Moses (not with
the first "request", but with the first sentence in a file).

I am thinking that perhaps I have no problem, and this is just the usual way
in which Moses works. Just to make sure:

1. would you say that a time of 2-3 seconds for translating (spa-eng) a
single sentence (~15-20 words) could be a normal response time (discounting
the time of loading tables)? (Intel Xeon with 32GB RAM)
2. if now you write 3 similar sentences in a file, and execute Moses
(command-line, serial) over this file, would you expect it to take a much
shorter time (perhaps the half) than the sum of the times for the 3 single
sentences?

If the answer is yes to both (especially the second), then I am probably
worrying in vain. My worries started when I read the "MTMonkey" paper:
http://ufal.mff.cuni.cz/~pecina/files/pbml-2013.pdf

Here the authors use the approach of creating a new ServerProxy instance
each time a request is sent to MosesServer (the worst case scenario for me),
and they get great results, so I thought they were not experiencing that
overhead for every request. But perhaps they just used sentences that get
translated very fast even with that overhead.

Well, in the case that this is the usual way of working for Moses, if what
Kenneth suggests is possible, that would eliminate the overhead almost
completely for MosesServer. I mean, there would still be an overhead for the
first request that each thread serves after the server is just launched, but
never more, because the caches would be filled with useful information from
that point on. I think that this would be extremely interesting for web
translating services, or to do web-page translations "on the fly".

However, I don't see a way to avoid that overhead in command-line Moses, as
it "dies" after each execution.

Marcos.


> mosesserver, afaik, creates a new thread for each connection, so it 
> can't take advantage of the cache. This is done in the xmlrpc-c library 
> so we don't have much control over it. If you dig around in the xmlrpc-c 
> documentation (or code!) you might find a way to control the threading 
> policy.
> 
> I just spoke to Marcin about the problem, and we're not sure if loading 
> the compact phrase table into memory would help, as you still would need 
> the higher level cache (in PhraseDictionary). But you could try this anyway.
> 
> cheers - Barry
> 
> On 06/03/14 17:20, Marcos Fernandez wrote:
> > Hi, I am having an issue with MosesServer.
> >
> > I am using compact phrase and reordering table, and KENLM.
> >
> > The problem is this (I'll explain with an example):
> >
> > - I have one file with 20 very short sentences. I split and tokenize
> > them and send one XMLRPC request per sentence to MosesServer
> > - If I create just one XMLRPC ServerProxy instance and I use it to send
> > all the requests through it, all the sentences get translated in approx
> > 2.5 sec. The problem is that the first sentence takes almost 2 seconds
> > to get translated, while the other 19 are much faster
> > - If I create one ServerProxy instance per request, the translation time
> > rises to 30 sec (now every sentence takes almost 2 sec)
> >
> > I don't understand the reason of that delay for the first request. I
> > have followed the source of this delay to the function:
> >
> > GetTargetPhraseCollectionLEGACY(const Phrase& src)
> >
> > in the file: ...TranslationModel/PhraseDictionary.cpp
> >
> > It seems that for the first request it's needed  to look for something
> > in the phrase table, while for subsequent requests it can be retrieved
> > (most of the times) from a cache.
> >
> > But, as the sentences in my file