Please subscribe to the mailing list before posting to it. You can
subscribe here:
   http://mailman.mit.edu/mailman/listinfo/moses-support

I don't really understand your questions. All characters are taken into
account by the decoder and the training algorithms.

There are some reserved characters that you must not use: [ ] | < > &
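
If your text may contain any of those characters, one option is to escape
them before decoding and restore them afterwards. Below is a minimal Python
sketch of that idea. The entity mapping is an assumption modelled on the
escaping done by the Moses tokenizer scripts, and the function names are
made up for illustration, so check both against your own Moses version.

```python
# Assumed mapping, modelled on the escaping used by the Moses tokenizer
# scripts; verify it against your Moses installation before relying on it.
MOSES_ESCAPES = [
    ("&", "&amp;"),   # must come first so later entities are not re-escaped
    ("|", "&#124;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("[", "&#91;"),
    ("]", "&#93;"),
]


def escape_reserved(text: str) -> str:
    """Replace the reserved characters with entities (hypothetical helper)."""
    for raw, escaped in MOSES_ESCAPES:
        text = text.replace(raw, escaped)
    return text


def unescape_reserved(text: str) -> str:
    """Undo escape_reserved; "&amp;" is restored last to avoid double decoding."""
    for raw, escaped in reversed(MOSES_ESCAPES):
        text = text.replace(escaped, raw)
    return text


if __name__ == "__main__":
    line = "a [b] | <c> & d"
    print(escape_reserved(line))
    print(unescape_reserved(escape_reserved(line)))
```

Escape each line before sending it to the decoder, and unescape the output
afterwards.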

You might want to put something into an XML tag, e.g. <private-xml
ff="....">. I think the decoder will then ignore it.


On 18 March 2014 14:20, <moses-support-ow...@mit.edu> wrote:

> As list administrator, your authorization is requested for the
> following mailing list posting:
>
>     List:    Moses-support@mit.edu
>     From:    arnaud.gicq...@linguacustodia.com
>     Subject: Issue about sentence segmentation
>     Reason:  Post by non-member to a members-only list
>
> At your convenience, visit:
>
>     http://mailman.mit.edu/mailman/admindb/moses-support
>
> to approve or deny the request.
>
>
> ---------- Forwarded message ----------
> From: Arnaud Gicquel <arnaud.gicq...@linguacustodia.com>
> To: moses-support@MIT.EDU
> Cc:
> Date: Tue, 18 Mar 2014 15:20:38 +0100
> Subject: Issue about sentence segmentation
>
> Hi all
>
> I am trying to develop a specific segmenter. The goal is to send the Moses
> decoder sentences instead of large, syntactically incoherent streams of
> text, so that the segmenter can be integrated into an automatic document
> translation workflow. I would define as a sentence delimiter any character
> that the decoder does not take into account in its statistics.
>
> Is there a completely neutral character (I don't want it to be considered
> unknown) that I could use as a delimiter?
>
> Thank you for your help
>
> Arnaud Gicquel
>
> --
> Lingua Custodia
> 1, Place Charles de Gaulle
> 78180 Montigny le Bretonneux - France
> Tel : 33 1 30 44 04 23
> Email :  arnaud.gicq...@linguacustodia.com
> Website :  www.linguacustodia.com
>
>



-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu