I forgot to mention that I also have a prototype for interactive
sentence alignment which you can try here:
http://stp.lingfil.uu.se/~joerg/Uplug/php/isa.php
Download uplug-main and uplug-webalign from
https://bitbucket.org/tiedemann/uplug/downloads
if you would like to use it.
Jörg
On Tue, Feb 12, 2
Hi,
Just to clarify corpus preparation for LM estimation:
SRILM & KenLM: do not add <s> or </s> to the corpus. They are added
internally (unless you disable this). So in this case, the LM file should
look like this:
this is a small house .
IRSTLM: decorate the corpus with <s> and </s> before build
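A minimal sketch of the IRSTLM-style preparation described above, assuming the boundary markers meant here are the conventional <s> and </s>. The function name add_boundary_markers and the sample corpus are hypothetical, and skipping blank lines is an illustrative choice rather than anything stated in the thread. For SRILM and KenLM this step would be left out, since they add the markers themselves.

```python
from typing import Iterable, Iterator


def add_boundary_markers(
    lines: Iterable[str], sos: str = "<s>", eos: str = "</s>"
) -> Iterator[str]:
    """Wrap each non-empty sentence in explicit start/end markers."""
    for line in lines:
        tokens = line.split()
        if not tokens:
            # Assumption: blank lines are dropped instead of becoming "<s> </s>".
            continue
        yield " ".join([sos, *tokens, eos])


# Hypothetical monolingual LM corpus, one tokenized sentence per line.
corpus = ["this is a small house .", "", "that is a big house ."]
marked = list(add_boundary_markers(corpus))
for sentence in marked:
    print(sentence)
```

The markers are parameters so the same helper can be reused with other boundary symbols.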
Oh yes, that looks very promising. Thanks!
On 12.02.2013 18:12, Joerg Tiedemann wrote:
> Maybe the following tool could be something for you?
> http://wanthalf.saga.cz/intertext
>
> Jörg
>
>
>
>
> On 12 feb 2013, at 17:52, Marcin Junczys-Dowmunt wrote:
>
>> Hi,
>> sort of off-topic, but can s
Although it is old by now, I still find Alpaco useful. If you like, I can send you a
script converting Giza++ output to Alpaco files.
http://www.d.umn.edu/~tpederse/parallel.html
/Lars
From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on behalf of
Marcin
Maybe the following tool could be something for you?
http://wanthalf.saga.cz/intertext
Jörg
On 12 feb 2013, at 17:52, Marcin Junczys-Dowmunt wrote:
> Hi,
> sort of off-topic, but can somebody recommend a tool for manual sentence
> alignment or post-editing of automatically aligned sentences?
Yes, John. The solutions look the same. It's very possible I got the
idea from your earlier writings.
We don't always use this technique.
When we do use them, we've rarely found extraneous markers. We use them
more regularly in the recaser data and it helps force first-word casing.
I think they
This sounds like our workaround. Just to make sure I understand, Tom, it
sounds like you add your own extra markers to everything, both for alignment
and language modeling, so the parallel files look like this (using <s> and
</s> instead of your music symbols):
<s> das ist ein kleines haus . </s>
this
Based on last year's eos marker discussions, we started using
alternate sos/eos markers in both parallel and lm corpora. We settled on
two obscure UTF-8 characters U+1D179 Musical Symbol Begin Phrase and
U+1D17A Musical Symbol End Phrase. As in standard corpus preparation,
the parallel corpora d
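A rough sketch of wrapping both sides of a parallel corpus in the two musical-symbol markers named above. The helper names mark_sentence and mark_parallel are hypothetical, and keeping blank lines blank (so the two sides stay line-aligned) is an assumption, not something the message confirms.

```python
import unicodedata
from typing import List, Sequence, Tuple

SOS = "\U0001D179"  # U+1D179 MUSICAL SYMBOL BEGIN PHRASE
EOS = "\U0001D17A"  # U+1D17A MUSICAL SYMBOL END PHRASE

# Sanity check that the code points are the ones named in the message.
assert unicodedata.name(SOS) == "MUSICAL SYMBOL BEGIN PHRASE"
assert unicodedata.name(EOS) == "MUSICAL SYMBOL END PHRASE"


def mark_sentence(line: str) -> str:
    """Surround one tokenized sentence with the alternate sos/eos markers."""
    tokens = line.split()
    if not tokens:
        # Assumption: blank lines stay blank so line counts do not change.
        return ""
    return " ".join([SOS, *tokens, EOS])


def mark_parallel(
    source: Sequence[str], target: Sequence[str]
) -> List[Tuple[str, str]]:
    """Mark both sides of a line-aligned parallel corpus."""
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of lines")
    return [(mark_sentence(s), mark_sentence(t)) for s, t in zip(source, target)]


pairs = mark_parallel(["das ist ein kleines haus ."], ["this is a small house ."])
```

The same mark_sentence helper could be applied to the monolingual LM corpus so that both models see identical markers.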
On 12/02/2013 16:38, Marcin Junczys-Dowmunt wrote:
> Hi,
> sort of off-topic, but can somebody recommend a tool for manual sentence
> alignment or post-editing of automatically aligned sentences? I want to
> create a gold standard for a parallel corpus I am creating and cannot
> really find a
Hi,
sort of off-topic, but can somebody recommend a tool for manual sentence
alignment or post-editing of automatically aligned sentences? I want to
create a gold standard for a parallel corpus I am creating and cannot
really find anything that looks useful.
Thanks,
Best,
Marcin
There is also a REST API available as part of the M4Loc project
http://code.google.com/p/m4loc/
that is able to handle inline formatting in sentences.
Achim
> -Original Message-
> From: moses-support-boun...@mit.edu [mailto:moses-support-
> boun...@mit.edu] On Behalf Of Ian Johnson
> S
All,
As part of the MosesCore project, we at Capita TI (formerly ALS) have
designed a proposed open standard API for MT systems. This specifies a
service interface rather than an application interface where the intention
was to make MT systems of different flavours look the same. Applications
**
TSD 2013 - FIRST CALL FOR PAPERS
**
Sixteenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2013)
Plzen (Pi
Tom,
In my opinion specifying a standard interface in REST is a bad thing to
do. It is tying applications down at a high level. REST, with HTTP, is
specifying the interface and, more importantly, the transport. Some
applications may not wish to use HTTP. Moreover, in an industrial
deployme
Hi Hieu and moses-support,
Sorry for the missing detail, but I forgot to say that I also used the
--translation-factors 0-0 option at the training step.
mosesdecoder/scripts/training/train-model.perl --mgiza --external-bin-dir
/usr/local/bin/ --corpus factored-corpus/proj-syndicate.1000 --root-dir
unfact
Hi Tom
As far as I know, REST
(http://en.wikipedia.org/wiki/Representational_state_transfer) is a type
of software architecture, rather than an API, and I think the XML-RPC
interface could be considered RESTful.
As you're probably aware, the Moses server alone isn't sufficient to
create a "tr