[Moses-support] Call for Participation: 'Bringing MT to the User: Research Meets Translators' Third Joint EM+/CNGL Workshop (JEC2011)

2011-09-09 Thread Ventsislav Zhechev

CALL FOR PARTICIPATION

“Bringing MT to the User: Research Meets Translators”
Third Joint EM+/CNGL Workshop (JEC 2011)
http://web.me.com/emcnglworkshop/JEC2011

The EuroMatrixPlus Project (http://www.euromatrixplus.eu), the Centre for Next 
Generation Localisation (CNGL) (http://cngl.ie), the Directorate-General for 
Translation (DGT, European Commission) (http://ec.europa.eu/dgs/translation) 
and Autodesk (http://www.autodesk.ch) are co-organising the Third Joint 
EM+/CNGL Workshop (JEC 2011), entitled “Bringing MT to the User: Research Meets 
Translators”. 
The JEC 2011 workshop will be hosted by the Directorate-General for Translation 
(DGT) (http://ec.europa.eu/dgs/translation) in Luxembourg on October 14th, 
2011. In keeping with previous JECs, the format of the workshop is highly 
interactive with research paper presentations, invited talks and a panel 
discussion.


PLEASE REGISTER
Attendance at the workshop is free. If you wish to participate, please register 
at http://web.me.com/emcnglworkshop/JEC2011/Registration.html


• Premise:
Recent years have seen a revolution in MT triggered by the emergence of 
statistical approaches and improvements in translation quality. MT (rule-based, 
statistical and hybrid) is now available for many languages for free (on the 
Web) or for a fee and MT technologies are making strong inroads into the 
corporate localisation and translation industries as well as large public and 
administrative organisations dealing with multi-lingual content. Open-source MT 
solutions are competing with proprietary products. Increasing numbers of 
(professional) translators are post-editing TM/MT output. MT is a reality for 
internet users accessing and gisting content which is not available in their 
native language.
At the same time, there has been a degree of disconnect between mainstream 
academic research and conferences on MT, often (and rightly so) focusing on 
algorithms to improve translation quality, and many of the important practical 
issues that need to be addressed to make MT maximally useful in real 
translation and localisation workflows, with human translators and users in 
general.

• Objectives:
JEC 2011 brings together translators, users, academic and industrial MT 
researchers and developers to discuss issues that are most important in real 
world industrial settings and applications involving MT, but currently 
under-represented in research circles.


• Invited Speakers:
Lucia Specia, RIILP, UK: "Quality Estimation for Machine Translation: Different 
Users, Different Needs"
Jörg Porsiel, Volkswagen, Germany: "Machine translation at Volkswagen"
Arle Lommel, GALA Global, USA: Panel Discussion Chair


• List of Accepted Papers for Oral Presentation:
1. "Online Self-Serve Access to State-of-the-Art SMT"
Andy Way, Kenny Holden, Lee Ball and Gavin Wheeldon
2. "User-Focused Task-Oriented MT Evaluation for Wikis: A Case Study"
Federico Gaspari, Antonio Toral, Sudip Kumar Naskar and Andy Way
3. "Towards Application of User-Tailored Machine Translation"
Andrejs Vasiļjevs, Raivis Skadiņš and Inguna Skadiņa
4. "A Review of Machine Translation Tools from a Post-Editing Perspective"
Lucas Vieira and Lucia Specia

• To be Confirmed for Oral Presentation:
5. "Putting Hybrid Machine Translation into Practice through Large-Scale 
Involvement of Human Translators in SomeProject"
Christian Federmann, Aljoscha Burchardt, Maja Popović, David Vilar and 
Eleftherios Avramidis
6. "Using Statistical Machine Translation for Computer-Aided Translation at the 
European Commission"
Andreas Eisele and Caroline Lavecchia

The preliminary workshop program is available at 
http://web.me.com/emcnglworkshop/JEC2011/Preliminary_Program.html
Abstracts of the accepted papers are available at 
http://web.me.com/emcnglworkshop/JEC2011/List_of_Accepted_Papers.html



• Deadlines (all 23:59 GMT -11):
30th September: Camera-ready deadline for accepted papers
6th October: On-line registration closes
14th October: Workshop takes place at DGT in Luxembourg


Workshop Chair:
Ventsislav Zhechev (Autodesk)

Workshop Senior PC:
Ventsislav Zhechev (Autodesk)
Andreas Eisele (DGT)
Philipp Koehn (Univ. of Edinburgh)
Josef van Genabith, Declan Groves (CNGL)

Program Committee:
Submitted papers were reviewed by a joint industry–academia committee.
Industry members:  Pedro L. Díez-Orzas (Linguaserve), Marc Dymetman (XRCE), 
Andreas Eisele (DGT of the EC), Daniel Grasmick (Lucy Software), Michael 
Jellinghaus (EU Parliament), Will Lewis (Microsoft), Yanjun Ma (Baidu), 
Alexandros Poulis (EU Parliament), Johann Roturier (Symantec), Andy Way 
(Applied Language Solutions), Zoran Zakic (DGT of the EC), Ventsislav Zhechev 
(Autodesk) 
Academic members: Michael Carl (CBS, Denmark), Jinhua Du (Xi’an Univ. of 
Technology), Josef van Genabith (CNGL, EM+), Decla

Re: [Moses-support] Ignoring Symbols?

2011-09-09 Thread Achim Ruopp
Taylor,
You can install the dependencies via CPAN. On Ubuntu (I assume you are on
Ubuntu/Debian, based on your question about apt-get):
sudo cpan
> install 

Achim 


-----Original Message-----
From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu]
On Behalf Of Taylor Rose
Sent: Friday, September 09, 2011 2:04 PM
To: moses-support@mit.edu
Subject: Re: [Moses-support] Ignoring Symbols?

Achim,

I am trying to run your scripts but have dependency issues. I do not
know Perl very well. Is it possible to get the dependencies through
apt-get?

-- 
Taylor Rose
Machine Translation Intern
Language Intelligence


On Thu, 2011-09-08 at 14:30 -0400, Achim Ruopp wrote:
> Taylor,
> You can have a look at the M4Loc project http://code.google.com/p/m4loc/
> We are working on pre-/post-processing scripts to preserve inline formatting
> like you describe. Moses itself has the option to wrap non-translatable text
> like the tags in XML
> (http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc4), but this
> doesn't address how to treat these tags during tokenization/recasing.
> 
> Achim 
> 
> 
> -Original Message-
> From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu]
> On Behalf Of Taylor Rose
> Sent: Thursday, September 08, 2011 11:30 AM
> To: moses-support@mit.edu
> Subject: [Moses-support] Ignoring Symbols?
> 
> Hello,
> 
> I've recently started working with Moses as part of my new internship.
> The company I work for uses in-house formatting tags on documents. (ie.
> paragraph, bold, indent, etc.) Is there a way I can make Moses ignore
> these and keep them in the correct position after translation? My first
> thoughts were to somehow tell Moses that  in English should
> translate to  in Spanish but I haven't found a way to do this if
> it is even possible.
> 
> I'm still learning Moses so please hold off on the RTFMs. The website is
> huge and I've only scratched the surface of the documentation. I would
> appreciate any links you could provide to relevant documents.
> 
> Thanks,

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] Ignoring Symbols?

2011-09-09 Thread Taylor Rose
Achim,

I am trying to run your scripts but have dependency issues. I do not
know Perl very well. Is it possible to get the dependencies through
apt-get?

-- 
Taylor Rose
Machine Translation Intern
Language Intelligence


On Thu, 2011-09-08 at 14:30 -0400, Achim Ruopp wrote:
> Taylor,
> You can have a look at the M4Loc project http://code.google.com/p/m4loc/
> We are working on pre-/post-processing scripts to preserve inline formatting
> like you describe. Moses itself has the option to wrap non-translatable text
> like the tags in XML
> (http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc4), but this
> doesn't address how to treat these tags during tokenization/recasing.
> 
> Achim 
> 
> 
> -Original Message-
> From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu]
> On Behalf Of Taylor Rose
> Sent: Thursday, September 08, 2011 11:30 AM
> To: moses-support@mit.edu
> Subject: [Moses-support] Ignoring Symbols?
> 
> Hello,
> 
> I've recently started working with Moses as part of my new internship.
> The company I work for uses in-house formatting tags on documents. (ie.
> paragraph, bold, indent, etc.) Is there a way I can make Moses ignore
> these and keep them in the correct position after translation? My first
> thoughts were to somehow tell Moses that  in English should
> translate to  in Spanish but I haven't found a way to do this if
> it is even possible.
> 
> I'm still learning Moses so please hold off on the RTFMs. The website is
> huge and I've only scratched the surface of the documentation. I would
> appreciate any links you could provide to relevant documents.
> 
> Thanks,


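A minimal sketch of the pre-/post-processing idea discussed in this thread: mask inline tags with placeholders before decoding and put them back afterwards. This is not taken from M4Loc or Moses. The square-bracket tag syntax, the __TAGn__ placeholder format and the function names are assumptions for illustration (the tag examples in the original post did not survive), and the sketch leaves open the tokenization/recasing question Achim raises.

```python
import re

# Hypothetical tag syntax: tags such as [b] and [/b] are assumed here,
# because the in-house format from the original post is not known.
TAG_PATTERN = re.compile(r"\[/?[A-Za-z][A-Za-z0-9]*\]")


def mask_tags(sentence):
    """Replace each tag with a numbered placeholder; return text and tag list."""
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        return "__TAG{}__".format(len(tags) - 1)

    return TAG_PATTERN.sub(_swap, sentence), tags


def unmask_tags(translated, tags):
    """Put the original tags back wherever their placeholders ended up."""
    def _restore(match):
        return tags[int(match.group(1))]

    return re.sub(r"__TAG(\d+)__", _restore, translated)


masked, tags = mask_tags("Press the [b]red[/b] button.")
# masked is now "Press the __TAG0__red__TAG1__ button."

# Stand-in for decoder output: a hand-written string, not real Moses output.
restored = unmask_tags("Pulse el botón __TAG0__rojo__TAG1__.", tags)
print(restored)  # Pulse el botón [b]rojo[/b].
```

Whether a decoder keeps such placeholders intact and in a sensible position depends on the model and the tokenizer; the sketch only shows the bookkeeping around the translation step.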

Re: [Moses-support] small bug in n-best output of moses_chart

2011-09-09 Thread Hieu Hoang

yep, i fixed it about an hour ago.
   
http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder?view=revision&revision=4209

Not sure how it crept in. Sorry

On 09/09/2011 17:34, Holger Schwenk wrote:


Hello,

I spotted a minor bug in line 387 of 
moses-chart-cmd/src/IOWrapper.cpp, revision 4200


one needs to change
 out << "w: ";
to
  out << " w: ";

i.e. add a space before "w:". Logically, we should do the same for "lm:", 
but this has no consequences.


This corrects the wrong output of n-best lists:

0 ||| White House promoting as soon as possible to send ??? 
supervision North Korea closed   ||| lm: -47.9563  tm: -13.0248 
-16.9619 -10.1332 -15.9577 6.99927*8.99907w: -7.38301 *||| -204.975


Can you please correct this in svn ?

Holger





[Moses-support] small bug in n-best output of moses_chart

2011-09-09 Thread Holger Schwenk


Hello,

I spotted a minor bug in line 387 of moses-chart-cmd/src/IOWrapper.cpp, 
revision 4200


one needs to change
 out << "w: ";
to
  out << " w: ";

i.e. add a space before "w:". Logically, we should do the same for "lm:", 
but this has no consequences.


This corrects the wrong output of n-best lists:

0 ||| White House promoting as soon as possible to send ??? supervision 
North Korea closed   ||| lm: -47.9563  tm: -13.0248 -16.9619 
-10.1332 -15.9577 6.99927*8.99907w: -7.38301 *||| -204.975


Can you please correct this in svn ?

Holger


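To illustrate why the missing space matters to anyone reading n-best lists, here is a small sketch of a parser for the score field. The parser and the shortened example lines are assumptions for illustration, not code from Moses; only the "label: values" layout is taken from the output shown above.

```python
def parse_scores(field):
    """Group the numbers in a score field under the preceding 'label:' token."""
    scores = {}
    label = None
    for token in field.split():
        if token.endswith(":"):
            label = token[:-1]
            scores[label] = []
        else:
            scores[label].append(float(token))
    return scores


def parse_nbest_line(line):
    """Split 'id ||| hypothesis ||| scores ||| total' into typed fields."""
    sent_id, hypothesis, score_field, total = [
        part.strip() for part in line.split("|||")
    ]
    return int(sent_id), hypothesis, parse_scores(score_field), float(total)


# Shortened, made-up lines in the layout shown above.
good = "0 ||| a b c ||| lm: -47.9563 tm: -13.0248 6.99927 w: -7.38301 ||| -204.975"
bad = "0 ||| a b c ||| lm: -47.9563 tm: -13.0248 6.99927w: -7.38301 ||| -204.975"

good_scores = parse_nbest_line(good)[2]
bad_scores = parse_nbest_line(bad)[2]
print(sorted(good_scores))  # ['lm', 'tm', 'w']
print(sorted(bad_scores))   # ['6.99927w', 'lm', 'tm']
```

With the fused token, a whitespace-based reader like this one silently loses the last "tm" value and files the word penalty under a bogus label, which is the kind of downstream breakage the one-character fix avoids.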

Re: [Moses-support] Moses git repository (was Sparse phrase table?)

2011-09-09 Thread Christian Federmann
Hi Barry, all,
 
Then take "moses-smt" ;)
 
I can also recommend switching to GitHub...
 
Cheers,
   Christian
 
 

Barry Haddow wrote on 9 September 2011 at 15:44:

 > Hi Christian
 >
 > Thanks for the tips.
 >
 > Yes, I think that creating an organisation would be the way to go. 
 > Then we could host other related projects in the organisation, rather 
 > than being inside mosesdecoder.
 >
 > Only problem is that 'moses' is already taken
 >
 > cheers - Barry
 >
 > Quoting Christian Rishøj Jensen  on Fri, 9 Sep 
 > 2011 10:51:11 +0200:
 >
 > >
 > > Hi Barry
 > >
 > > A few comments for consideration when deciding:
 > >
 > > I think the recently introduced "organizations" feature addresses 
 > > the concerns about supporting multiple administrators.
 > > https://github.com/blog/674-introducing-organizations
 > >
 > > Moses could be hosted under its own name at e.g. 
 > > https://github.com/moses/moses, as is the case with e.g. stuff like 
 > > https://github.com/webpy/webpy
 > >
 > > For commit emails, you would probably need to run something like 
 > > https://github.com/adamhjk/github-commit-email on a machine 
 > > somewhere and specify the location as post-receive hook on GitHub:
 > > http://help.github.com/post-receive-hooks/
 > >
 > > Still free for open source.
 > >
 > > Best
 > > Christian
 > >
 > > On Sep 8, 2011, at 4:10 PM, Barry Haddow wrote:
 > >
 > >> Hi Christian
 > >>
 > >> We haven't made a final decision on which git host to use. We thought
 > >> github was faster, and had a more polished web interface. But there
 > >> were a couple of important things that it didn't appear to support:
 > >>
 > >>  - multiple administrators
 > >>  - commit emails
 > >>
 > >> Also, the url would include the username, e.g.
 > >> https://github.com/obo/mosesdecoder, which isn't really what we want,
 > >>
 > >> cheers - Barry
 > >>
 > >>
 > >> Quoting Christian Rishøj Jensen  on Thu, 8 Sep
 > >> 2011 15:10:30 +0200:
 > >>
 > >>>
 > >>> Dear Barry,
 > >>>
 > >>> While I'm not a committer to Moses, I am in favor of Git in general.
 > >>> But why not go all the way and host the repository at GitHub? The
 > >>> added collaboration tools could prove to be valuable and conducive
 > >>> for new contributions, as well as increase the transparency of the
 > >>> development process.
 > >>>
 > >>> A short piece on the benefits:
 > >>> http://bdethics.blogspot.com/2011/03/why-github-is-taking-over-universe.html
 > >>>
 > >>> Best
 > >>> Christian

Re: [Moses-support] Moses git repository (was Sparse phrase table?)

2011-09-09 Thread Barry Haddow
Hi Christian

Thanks for the tips.

Yes, I think that creating an organisation would be the way to go.  
Then we could host other related projects in the organisation, rather  
than being inside mosesdecoder.

Only problem is that 'moses' is already taken

cheers - Barry

Quoting Christian Rishøj Jensen  on Fri, 9 Sep  
2011 10:51:11 +0200:

>
> Hi Barry
>
> A few comments for consideration when deciding:
>
> I think the recently introduced "organizations" feature addresses  
> the concerns about supporting multiple administrators.
> https://github.com/blog/674-introducing-organizations
>
> Moses could be hosted under its own name at e.g.  
> https://github.com/moses/moses, as is the case with e.g. stuff like  
> https://github.com/webpy/webpy
>
> For commit emails, you would probably need to run something like  
> https://github.com/adamhjk/github-commit-email on a machine  
> somewhere and specify the location as post-receive hook on GitHub:
> http://help.github.com/post-receive-hooks/
>
> Still free for open source.
>
> Best
> Christian
>
> On Sep 8, 2011, at 4:10 PM, Barry Haddow wrote:
>
>> Hi Christian
>>
>> We haven't made a final decision on which git host to use. We thought
>> github was faster, and had a more polished web interface. But there
>> were a couple of important things that it didn't appear to support:
>>
>>  - multiple administrators
>>  - commit emails
>>
>> Also, the url would include the username, e.g.
>> https://github.com/obo/mosesdecoder, which isn't really what we want,
>>
>> cheers - Barry
>>
>>
>> Quoting Christian Rishøj Jensen  on Thu, 8 Sep
>> 2011 15:10:30 +0200:
>>
>>>
>>> Dear Barry,
>>>
>>> While I'm not a committer to Moses, I am in favor of Git in general.
>>> But why not go all the way and host the repository at GitHub? The
>>> added collaboration tools could prove to be valuable and conducive
>>> for new contributions, as well as increase the transparency of the
>>> development process.
>>>
>>> A short piece on the benefits:
>>> http://bdethics.blogspot.com/2011/03/why-github-is-taking-over-universe.html
>>>
>>> Best
>>> Christian
Re: [Moses-support] Error encountered in make World (srilm installation)

2011-09-09 Thread Taylor Rose
Ceslav,

That worked like a charm. Thanks!
-- 
Taylor Rose
Machine Translation Intern
Language Intelligence


On Fri, 2011-09-09 at 09:44 +0200, Česlav Przywara wrote:
> Taylor,
> are you running 64-bit Ubuntu? Cause I've recently compiled SRILM on
> my Ubuntu and probably came across the very same problem. (I don't
> remember the filename, but stubs.h sounds familiar to me.)
> 
> Anyway, I think your problem might be caused by SRILM being compiled
> in 32-bit mode instead of 64-bit. This is likely to be caused by
> incorrect machine type detection (at least it was in my case). Try to
> uncomment line 109 in srilm/sbin/machine-type script and comment out
> the following line, so instead of setting MACHINE_TYPE to i686 the
> script will set it to i686-m64.
> 
> Cheers,
> Česlav
> 
> on 08/09/11 21:18 Taylor Rose said the following: 
> > This is really old but I found a work around and I figure I'd post
> > this to save time for anyone in the future with this problem.
> > 
> > This is in response to a post from 17 Aug 2010:
> > 
> > --
> > Dear Moses Support, 
> > While following the installation instructions (
> > http://www.statmt.org/moses_steps.html), I encountered the following error.
> > This happened while running *make World*.
> > 
> > 
> > make[2]: Entering directory `/home/sxl382/demo/tools/srilm/misc/src'
> > gcc -m32 -mtune=pentium3 -Wall -Wno-unused-variable -Wno-uninitialized
> > -D_FILE_OFFSET_BITS=64-I. -I../../include   -c -g -O3 -o
> > ../obj/i686/option.o option.c
> > In file included from /usr/include/features.h:352,
> >  from /usr/include/stdio.h:28,
> >  from option.c:22:
> > /usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or
> > directory
> > make[2]: *** [../obj/i686/option.o] Error 1
> > make[2]: Leaving directory `/home/sxl382/demo/tools/srilm/misc/src'
> > make[1]: *** [release-libraries] Error 1
> > make[1]: Leaving directory `/home/sxl382/demo/tools/srilm'
> > make: *** [World] Error 2
> > 
> > 
> > 
> > 
> > Could you please let me know if there is anything I am missing ?
> > 
> > 
> > Thank you very much.
> > 
> > Regards,
> > Shibamouli Lahiri
> > 
> > 
> > So I had this same exact error. I'm running 64-bit Ubuntu 10.10.
> > 
> > If you go to the directory in question you'll see that indeed
> > stubs-32.h does not exist. In its place is stubs-64.h.
> > 
> > Now open stubs.h in your favorite text editor.
> > 
> > You'll see:
> > #if __WORDSIZE == 32
> > # include 
> > #elif __WORDSIZE == 64
> > # include 
> > 
> > Change it to:
> > #if __WORDSIZE == 32
> > # include 
> > #elif __WORDSIZE == 64
> > # include 
> > 
> > Boom. Done.
> > 
> > This allowed me to install srilm.
> > 
> > Does anyone know if this modification is a huge problem? I am no expert
> > with linux so I'm not sure if this change will cause other things to
> > blow up later.
> > -- 
> > Taylor Rose
> > Machine Translation Intern
> > Language Intelligence
> > 
> > 
> > 
> > 


Re: [Moses-support] Moses-support Digest, Vol 59, Issue 14

2011-09-09 Thread Andy Way (Applied Language)
Taylor,

> I've recently started working with Moses as part of my new internship.
> The company I work for uses in-house formatting tags on documents. (ie.
> paragraph, bold, indent, etc.) Is there a way I can make Moses ignore
> these and keep them in the correct position after translation? My first
> thoughts were to somehow tell Moses that  in English should
> translate to  in Spanish but I haven't found a way to do this if
> it is even possible.

Have a look at our paper "TMX Markup: A Challenge When Adapting SMT to
the Localisation Environment", presented at EAMT-10, and available at:
http://www.computing.dcu.ie/~away/PUBS/2010/tag_processing_final.pdf

There we found that integrating the tags as part of the t-table
improved translation quality. We were also able to preserve the
correct tag order on the target side, so don't delete these tags too
hastily!
Andy.

-- 
Andy Way
Director of Language Technology
andy@appliedlanguage.com
Skype ID: andy.way_als
Tel: 07808 609107



Re: [Moses-support] Moses git repository (was Sparse phrase table?)

2011-09-09 Thread Christian Rishøj Jensen

Hi Barry

A few comments for consideration when deciding:

I think the recently introduced "organizations" feature addresses the concerns 
about supporting multiple administrators.
https://github.com/blog/674-introducing-organizations

Moses could be hosted under its own name at e.g. 
https://github.com/moses/moses, as is the case with e.g. stuff like 
https://github.com/webpy/webpy

For commit emails, you would probably need to run something like 
https://github.com/adamhjk/github-commit-email on a machine somewhere and 
specify the location as post-receive hook on GitHub:
http://help.github.com/post-receive-hooks/

Still free for open source.

Best
Christian

On Sep 8, 2011, at 4:10 PM, Barry Haddow wrote:

> Hi Christian
> 
> We haven't made a final decision on which git host to use. We thought  
> github was faster, and had a more polished web interface. But there  
> were a couple of important things that it didn't appear to support:
> 
>  - multiple administrators
>  - commit emails
> 
> Also, the url would include the username, e.g.  
> https://github.com/obo/mosesdecoder, which isn't really what we want,
> 
> cheers - Barry
> 
> 
> Quoting Christian Rishøj Jensen  on Thu, 8 Sep  
> 2011 15:10:30 +0200:
> 
>> 
>> Dear Barry,
>> 
>> While I'm not a committer to Moses, I am in favor of Git in general.  
>> But why not go all the way and host the repository at GitHub? The  
>> added collaboration tools could prove to be valuable and conducive  
>> for new contributions, as well as increase the transparency of the  
>> development process.
>> 
>> A short piece on the benefits:  
>> http://bdethics.blogspot.com/2011/03/why-github-is-taking-over-universe.html
>> 
>> Best
>> Christian
>> 
>> On Sep 7, 2011, at 3:07 PM, Barry Haddow wrote:
>> 
>>> Hi Lane
>>> 
>>> So the secret's out...
>>> 
>>> This is a mirror of the current svn repository. Updating is currently
>>> a manual process, kicked off by either me or Ondrej, but it may get
>>> automated.
>>> 
>>> Moses development will probably move to git soonish, have your say in
>>> the doodle poll if you have an opinion about it.
>>> http://www.doodle.com/dgnnzdu697yxnhve
>>> 
>>> At the moment, you're free to create new branches in the git
>>> repository, but please don't push to any branches which track svn
>>> branches. Otherwise BAD things will happen,
>>> 
>>> cheers - Barry
>>> 
>>> 
>>> Quoting Lane Schwartz  on Wed, 7 Sep 2011  
>>> 08:54:55 -0400:
>>> 
>>>> Barry,
>>>>
>>>> I wasn't aware that there was a Moses git repository. Is it just a
>>>> mirror of the subversion repo? Are there plans to move primary
>>>> development to the git repo?
>>>>
>>>> Thanks,
>>>> Lane
>>>>
>>>>
>>>> On Wed, Sep 7, 2011 at 3:49 AM, Barry Haddow wrote:
>>>>> Hi Anne
>>>>>
>>>>> Yes, there is a version of moses which supports sparse features. It is
>>>>> available in the moses git repository
>>>>> http://sourceforge.net/scm/?type=git&group_id=171520
>>>>> Look for the miramerge branch.
>>>>>
>>>>> It was sync'ed from trunk within the last few weeks, and I hope to
>>>>> keep it reasonably up-to-date -  it may eventually get merged back.
>>>>>
>>>>> Whilst this branch is fairly experimental, it does work, and has been
>>>>> used in experiments. I'm currently trying to improve the sparse
>>>>> feature code, so that it will use a dense representation of the core
>>>>> features. There are actually implementations of most, if not all, of
>>>>> the Chiang features, although I don't know if they're all checked into
>>>>> the repository.
>>>>>
>>>>> cheers - Barry
>>>>>
>>>>>
>>>>>
>>>>> Quoting Anne Schuth on Wed, 7 Sep 2011
>>>>> 09:34:19 +0200:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> We are in the process of reimplementing some of the 11,001 new features of
>>>>>> the Chiang et al. 2009 paper. We are adding a few thousand features to our
>>>>>> phrase table, causing it to blow up significantly. For tuning purposes we
>>>>>> filter the table to only include phrases used by our tuning dataset which
>>>>>> brings the size on disk down to about 200MB (gzipped). However, as soon as
>>>>>> we load this table into memory with Moses, it takes more than 60GB. This is
>>>>>> not really a surprise I guess since Moses will represent all our 0's as
>>>>>> floating points, but it is a problem since not all machines I would like to
>>>>>> run this on have that much memory.
>>>>>> This leads to my question: does Moses support some form of sparse
>>>>>> representation of phrase tables? Or, how is this issue generally solved, as
>>>>>> I am quite sure we are not the first to try this.
>>>>>>
>>>>>> Any comments, pointers to documentation are very much appreciated!
>>>>>>
>>>>>> Best,
>>>>>> Anne
>>>>>>
>>>>>> --
>>>>>> Anne Schuth
>>>>>> ILPS - ISLA - FNWI
>>>>>> University of Amsterdam
>>>>>> Science Park 904, C3.230
>>>>>> 1098 XH AMSTERDAM
>>>>>> The Netherlands
>>>>>> 0031 (0) 20 525 5357
>>>>>>
>>>>>
>>>>>
>>>>>

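The memory blow-up Anne describes in the quoted message can be made concrete with a back-of-the-envelope sketch. The table sizes, the 4-byte float and the (index, value) sparse layout below are assumptions chosen only to show the scale of the effect; they are not measurements of Moses or a description of how the miramerge branch stores features, and they ignore phrase strings and per-object overhead.

```python
BYTES_PER_FLOAT = 4  # assumption: single-precision feature scores


def dense_bytes(num_phrase_pairs, num_features):
    """Every feature of every phrase pair is stored, zeros included."""
    return num_phrase_pairs * num_features * BYTES_PER_FLOAT


def sparse_bytes(num_phrase_pairs, avg_nonzero, bytes_per_index=4):
    """Only non-zero features are stored, each as an (index, value) pair."""
    return num_phrase_pairs * avg_nonzero * (bytes_per_index + BYTES_PER_FLOAT)


# Hypothetical sizes: 5 million phrase pairs, 3,000 features,
# of which about 10 are non-zero for a typical phrase pair.
pairs, features, nonzero = 5_000_000, 3_000, 10
print(dense_bytes(pairs, features) / 1e9)  # 60.0 (GB)
print(sparse_bytes(pairs, nonzero) / 1e9)  # 0.4 (GB)
```

Under these assumed numbers the dense layout is two orders of magnitude larger, which is why storing only the non-zero features is attractive when most feature values are 0.
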
Re: [Moses-support] Error encountered in make World (srilm installation)

2011-09-09 Thread Česlav Przywara

Taylor,
are you running 64-bit Ubuntu? Cause I've recently compiled SRILM on my 
Ubuntu and probably came across the very same problem. (I don't remember 
the filename, but stubs.h sounds familiar to me.)


Anyway, I think your problem might be caused by SRILM being compiled in 
32-bit mode instead of 64-bit. This is likely to be caused by incorrect 
machine type detection (at least it was in my case). Try to uncomment 
line 109 in srilm/sbin/machine-type script and comment out the following 
line, so instead of setting MACHINE_TYPE to i686 the script will set it 
to i686-m64.


Cheers,
Česlav

on 08/09/11 21:18 Taylor Rose said the following:
This is really old but I found a work around and I figure I'd post 
this to save time for anyone in the future with this problem.


This is in response to a post from 17 Aug 2010:

--
Dear Moses Support,
While following the installation instructions (
http://www.statmt.org/moses_steps.html), I encountered the following error.
This happened while running *make World*.


make[2]: Entering directory `/home/sxl382/demo/tools/srilm/misc/src'
gcc -m32 -mtune=pentium3 -Wall -Wno-unused-variable -Wno-uninitialized
-D_FILE_OFFSET_BITS=64-I. -I../../include   -c -g -O3 -o
../obj/i686/option.o option.c
In file included from /usr/include/features.h:352,
  from /usr/include/stdio.h:28,
  from option.c:22:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or
directory
make[2]: *** [../obj/i686/option.o] Error 1
make[2]: Leaving directory `/home/sxl382/demo/tools/srilm/misc/src'
make[1]: *** [release-libraries] Error 1
make[1]: Leaving directory `/home/sxl382/demo/tools/srilm'
make: *** [World] Error 2




Could you please let me know if there is anything I am missing ?


Thank you very much.

Regards,
Shibamouli Lahiri


So I had this same exact error. I'm running 64-bit Ubuntu 10.10.

If you go to the directory in question you'll see that indeed
stubs-32.h does not exist. In its place is stubs-64.h.

Now open stubs.h in your favorite text editor.

You'll see:
#if __WORDSIZE == 32
# include
#elif __WORDSIZE == 64
# include

Change it to:
#if __WORDSIZE == 32
# include
#elif __WORDSIZE == 64
# include

Boom. Done.

This allowed me to install srilm.

Does anyone know if this modification is a huge problem? I am no expert
with linux so I'm not sure if this change will cause other things to
blow up later.
--
Taylor Rose
Machine Translation Intern
Language Intelligence


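Before editing the machine-type script as suggested above, it can help to confirm what the machine actually reports. The following is a small sanity-check sketch, not part of SRILM: the function names are made up here, and the mapping covers only the two machine types named in this thread (i686 and i686-m64).

```python
import platform
import struct


def pointer_bits():
    """Pointer width of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def expected_machine_type(machine, bits):
    """Map a platform name to one of the machine types named in this thread.

    Anything other than 32-bit or 64-bit x86 returns None rather than guessing.
    """
    if machine in ("x86_64", "amd64") and bits == 64:
        return "i686-m64"
    if machine in ("i386", "i486", "i586", "i686") and bits == 32:
        return "i686"
    return None


machine = platform.machine().lower()
print(machine, pointer_bits(), expected_machine_type(machine, pointer_bits()))
```

If this reports i686-m64 while the build still compiles with -m32 into obj/i686, that would be consistent with the incorrect machine type detection Česlav describes.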
