[Moses-support] Looking for Position

2021-02-01 Thread amir haghighi
Hi Everyone,

(I apologize for sending this message to this list)

I work in Machine Translation and have 10 years of experience in this
field. I'm currently seeking a postdoc position in MT or NLP.
Could anyone please advise me on how to find a suitable position? Is there
any mailing list other than the corpora list for finding postdoc
positions? Is it appropriate to email professors and ask about
vacancies, or are all vacancies officially announced?

I appreciate any help you can provide.
Thanks
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Google SMT system

2020-08-25 Thread amir haghighi
Hi everyone

I was wondering if the old version of Google Translate (the SMT version) can
be accessed somewhere? I need the output of an SMT system for my research.
I found some APKs, but they can't be installed on new phones.
I'd appreciate it if you could help me.


Regards


[Moses-support] Google SMT translator

2020-08-02 Thread amir haghighi
Hi everyone

I was wondering if the old version of Google Translate (the SMT version) can
be accessed somewhere?

Regards


Re: [Moses-support] Free cloud service to train NMT

2018-06-08 Thread amir haghighi
Thanks Hieu.

Is it possible to train an NMT system on a laptop? I mean, is there any
laptop that I could buy for this purpose?

Thanks

On Fri, Jun 8, 2018 at 8:36 PM, Hieu Hoang  wrote:

> try this
>   https://developer.nvidia.com/academic_gpu_seeding
> or search the web
>
> Hieu Hoang
>
> On 8 June 2018 at 14:14, amir haghighi  wrote:
>
>> Hello
>>
>> I'm going to set up an NMT system using OpenNMT or Nematus, but I can't
>> run it on my laptop and I don't have access to any cluster.
>> I was wondering if there is any free cloud computing service that can be
>> used to set up a full-size, state-of-the-art NMT system?
>>
>> Thanks
>>
>>
>>
>>
>


[Moses-support] Free cloud service to train NMT

2018-06-08 Thread amir haghighi
Hello

I'm going to set up an NMT system using OpenNMT or Nematus, but I can't run
it on my laptop and I don't have access to any cluster.
I was wondering if there is any free cloud computing service that can be
used to set up a full-size, state-of-the-art NMT system?

Thanks


Re: [Moses-support] SMT decoding complexity

2017-02-27 Thread amir haghighi
Thanks Philipp,

Yes, my formula has exponential terms, but since the factorial grows
faster than the exponential, the complexity of the decoding
algorithm (without any constraint on reordering or pruning of the search space)
is O(n!), right?

Thanks Barry, I'll read the paper.
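
To put numbers on the growth-rate point above, here is a small C++ sketch
(not Moses code; the function names are made up for illustration) that
compares log2(n!) with log2((2T)^n), the two parts of the
power(2,n)*n!*power(T,n) formula from the original question. T = 10 is
assumed purely as an example value. For any constant c, n! eventually
outgrows c^n, but only once n exceeds roughly c*e, so for short sentences the
exponential part is still the larger one.

```cpp
#include <cmath>
#include <cstdio>

// log2 of n!, computed via the log-gamma function: lgamma(n + 1) = ln(n!).
double Log2Factorial(int n) {
  return std::lgamma(n + 1.0) / std::log(2.0);
}

// log2 of (2T)^n = 2^n * T^n, the purely exponential part of the formula.
double Log2Exponential(int n, int T) {
  return n * std::log2(2.0 * T);
}

// Prints both terms side by side for sentence lengths 10, 20, ..., 100.
void PrintGrowth(int T) {
  for (int n = 10; n <= 100; n += 10) {
    std::printf("n=%3d  log2(n!)=%8.1f  log2((2T)^n)=%8.1f\n",
                n, Log2Factorial(n), Log2Exponential(n, T));
  }
}
```

Calling PrintGrowth(10) from main() shows the factorial term overtaking the
exponential one somewhere between n = 50 and n = 60. The usual textbook
argument for the exponential bound, as far as it can be summarized here, is
that all orderings reaching the same coverage vector are recombined into one
of the 2^n states, so the n! factor does not appear in the number of states.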

On Mon, Feb 27, 2017 at 7:24 PM, Philipp Koehn  wrote:

> Hi,
>
> I am not sure if I follow your question - in the formula you cite,
> there are exponential terms: 2^n and T^n.
>
> The Knight paper is worth trying to understand (it's on IBM Models,
> but applies similarly to phrase-based models).
>
> Also keep in mind that limited reordering windows and beam search
> make actual decoding algorithm implementations linear.
>
> -phi
>
> On Sun, Feb 26, 2017 at 1:16 PM, amir haghighi
>  wrote:
> > Hi all,
> >
> > In the Moses manual and also in SMT textbooks it is mentioned that the
> > decoding complexity for PB-SMT is exponential in the source sentence
> > length.
> > If we have a source sentence of length n, then in decoding by hypothesis
> > expansion we have power(2,n) states, each of which can be reordered in n!
> > orders, and each state can be translated in power(T,n) ways, where T is the
> > number of translation options, right?
> > So the decoder complexity is power(2,n)*n!*power(T,n); why, then, is it
> > said that the complexity is exponential?
> >
> > Could someone please explain to me how the decoder complexity is
> > calculated?
> > I've read the Knight (1999) paper, but I couldn't understand it. Could you
> > please suggest another reference?
> >
> > Thanks
> >
> >
> >
>


[Moses-support] SMT decoding complexity

2017-02-26 Thread amir haghighi
Hi all,

In the Moses manual and also in SMT textbooks it is mentioned that the
decoding complexity for PB-SMT is exponential in the source sentence
length. If we have a source sentence of length n, then in decoding by
hypothesis expansion we have power(2,n) states, each of which can be
reordered in n! orders, and each state can be translated in power(T,n) ways,
where T is the number of translation options, right?
So the decoder complexity is power(2,n)*n!*power(T,n); why, then, is it
said that the complexity is exponential?

Could someone please explain to me how the decoder complexity is
calculated?
I've read the Knight (1999) paper, but I couldn't understand it. Could you
please suggest another reference?

Thanks


[Moses-support] Fwd: Significant Test

2017-01-31 Thread amir haghighi
Hi All

Could someone please explain to me what "significance test" and
"p-value" mean?
I've read both Koehn's and Clark's papers on significance testing, but I
still don't know what a p-value such as 0.05 means. Does it mean that, if
the difference between the scores of two systems is x, then with
probability 95% we will get the same difference if we repeat the
experiments?
Does a significance test deal with the randomness of the tuning process, or
does it deal with test set selection?

Any help would be appreciated
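
As a rough illustration of what such a test computes, here is a minimal C++
sketch of paired bootstrap resampling, the test-set resampling idea usually
associated with Koehn's paper. It is not code from Moses or from either
paper. It assumes per-sentence scores that can simply be summed (a
simplification: BLEU is a corpus-level metric and would have to be recomputed
for each resample), and the function name and parameters are made up for this
example. The returned value is the share of resampled test sets on which
system A fails to beat system B; a value below 0.05 is what is commonly
reported as "significant at p < 0.05". It estimates how sensitive the
comparison is to the choice of test sentences. It does not say that the same
score difference will be observed again with 95% probability, and it does not
cover the randomness of tuning, which would need several tuning runs per
system.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Share of bootstrap resamples of the test set on which system A does NOT
// beat system B. scoresA[i] and scoresB[i] are hypothetical per-sentence
// scores of the two systems on test sentence i (higher is better).
double PairedBootstrapPValue(const std::vector<double>& scoresA,
                             const std::vector<double>& scoresB,
                             int numResamples, unsigned seed) {
  const std::size_t n = scoresA.size();
  if (n == 0 || scoresB.size() != n || numResamples <= 0) return 1.0;

  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> pick(0, n - 1);

  int notBetter = 0;
  for (int r = 0; r < numResamples; ++r) {
    // Draw n sentence indexes with replacement; both systems are scored
    // on the same draw, which is what makes the test "paired".
    double sumA = 0.0, sumB = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
      const std::size_t i = pick(rng);
      sumA += scoresA[i];
      sumB += scoresB[i];
    }
    if (sumA <= sumB) ++notBetter;
  }
  return static_cast<double>(notBetter) / numResamples;
}
```

With identical score vectors the function returns 1.0, and when A is better
on every single sentence it returns 0.0; real systems fall in between.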


Re: [Moses-support] please help me with the code - getting word index

2015-06-20 Thread amir haghighi
Thanks Matthias
ChartHypothesis::GetCurrSourceRange() gets the source span that all the
terminals and non-terminals in the current hypothesis cover in the source
sentence. I'd like to know which terminal (or non-terminal) corresponds
to which word index in the source sentence. Could you tell me how to
obtain that?

Thanks again
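
Matthias's suggestion (quoted below) can be sketched as follows: walk over
the source side of the rule from the start of the hypothesis's source range,
advance by one position for each terminal, and jump past the predecessor's
range for each non-terminal. The types and the function name are stand-ins
made up for this example, not Moses classes; in Moses the ranges would come
from calls such as ChartHypothesis::GetCurrSourceRange() on the current and
previous hypotheses.

```cpp
#include <cstddef>
#include <vector>

// Inclusive span of source word positions.
struct Range {
  std::size_t start;
  std::size_t end;
};

// One symbol on the source side of a rule. For a non-terminal, childRange is
// the source span of the predecessor hypothesis that fills it.
struct SourceSymbol {
  bool isNonTerminal;
  Range childRange;
};

// Returns the absolute source-sentence index of every terminal in the rule,
// in source order.
std::vector<std::size_t> AbsoluteTerminalPositions(
    const Range& hypoRange, const std::vector<SourceSymbol>& ruleSource) {
  std::vector<std::size_t> positions;
  std::size_t pos = hypoRange.start;
  for (const SourceSymbol& sym : ruleSource) {
    if (sym.isNonTerminal) {
      pos = sym.childRange.end + 1;  // skip the span covered by the predecessor
    } else {
      positions.push_back(pos);      // a terminal covers exactly one source word
      ++pos;
    }
  }
  return positions;
}
```

For a hypothesis covering [1..4] whose rule source is "terminal, non-terminal
over [2..3], terminal", this gives the positions 1 and 4.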


On Thu, Jun 18, 2015 at 9:48 PM, Matthias Huck  wrote:

> Hi,
>
> You can calculate absolute positions in the source sentence based on the
> words range of the current hypothesis and those of the direct
> predecessors (in case of right-hand side non-terminals).
>
> Take a look at these methods:
>
> InputPath::GetWordsRange()
> ChartHypothesis::GetCurrSourceRange()
> ChartCellLabel::GetCoverage()
>
> Cheers,
> Matthias
>
>
> On Thu, 2015-06-18 at 20:23 +0430, amir haghighi wrote:
> > Hi everybody
> >
> >
> > I wrote the following code to get an ordered list of the source words
> > inside a hypothesis. It gets the words in their translation order, but I
> > need not only the words' strings but also the index of each word in the
> > original sentence.
> >
> > Could you please help me get the index, in the sentence, of each word in
> > srcPhrase?
> >
> >
> > void Amir::GetSourcePhrase2(const ChartHypothesis& cur_hypo, Phrase& srcPhrase) const
> > {
> >   AmirUtils utility;
> >   TargetPhrase targetPh = cur_hypo.GetCurrTargetPhrase();
> >   const Phrase *sourcePh = targetPh.GetRuleSource();
> >   int targetWordsNum = cur_hypo.GetCurrTargetPhrase().GetSize();
> >   // The element types of these declarations were lost in the archive;
> >   // the variables are not used below.
> >   // std::vector<...> source, orderedSource;
> >   // std::vector<...> alignmentVector;
> >   // std::vector<...> isAligned;
> >
> >   // For each target position, the set of aligned source positions
> >   // (element type assumed to be std::set<size_t>).
> >   std::vector<std::set<size_t> > sourcePosSets;
> >
> >   for (int targetP = 0; targetP < targetWordsNum; targetP++) {
> >     sourcePosSets.push_back(cur_hypo.GetCurrTargetPhrase().GetAlignTerm().GetAlignmentsForTarget(targetP));
> >   }
> >
> >   for (int ii = targetWordsNum - 1; ii >= 0; ii--) {
> >     std::set<size_t> cur_srcPosSet = sourcePosSets[ii];
> >     for (std::set<size_t>::const_iterator alignment = cur_srcPosSet.begin();
> >          alignment != cur_srcPosSet.end(); ++alignment) {
> >       int alignmentElement = *alignment;
> >       // Remove this source position from the other sets. The loop bound
> >       // was lost in the archive; index < ii is an assumption.
> >       for (int index = 0; index < ii; index++) {
> >         if (sourcePosSets[index].size() > 0) {
> >           sourcePosSets[index].erase(alignmentElement);
> >         }
> >       }
> >     }
> >   }
> >
> >   for (size_t posT = 0; posT < cur_hypo.GetCurrTargetPhrase().GetSize(); ++posT) {
> >     const Word &word = cur_hypo.GetCurrTargetPhrase().GetWord(posT);
> >     if (word.IsNonTerminal()) {
> >       // non-terminal: fill out with the previous hypothesis
> >       size_t nonTermInd = cur_hypo.GetCurrTargetPhrase().GetAlignNonTerm().GetNonTermIndexMap()[posT];
> >       const ChartHypothesis *prevHypo = cur_hypo.GetPrevHypo(nonTermInd);
> >       GetSourcePhrase2(*prevHypo, srcPhrase);
> >     } else {
> >       for (std::set<size_t>::const_iterator it = sourcePosSets[posT].begin();
> >            it != sourcePosSets[posT].end(); it++) {
> >         srcPhrase.AddWord(sourcePh->GetWord(*it));
> >       }
> >     }
> >   }
> > }
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>


[Moses-support] please help me with the code - getting word index

2015-06-18 Thread amir haghighi
Hi everybody


I wrote the following code to get an ordered list of the source words
inside a hypothesis. It gets the words in their translation order, but I
need not only the words' strings but also the index of each word in the
original sentence.

Could you please help me get the index, in the sentence, of each word in
srcPhrase?


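void Amir::GetSourcePhrase2(const ChartHypothesis& cur_hypo, Phrase& srcPhrase) const
{
  AmirUtils utility;
  TargetPhrase targetPh = cur_hypo.GetCurrTargetPhrase();
  const Phrase *sourcePh = targetPh.GetRuleSource();
  int targetWordsNum = cur_hypo.GetCurrTargetPhrase().GetSize();
  // The element types of these declarations were lost in the archive;
  // the variables are not used below.
  // std::vector<...> source, orderedSource;
  // std::vector<...> alignmentVector;
  // std::vector<...> isAligned;

  // For each target position, the set of aligned source positions
  // (element type assumed to be std::set<size_t>).
  std::vector<std::set<size_t> > sourcePosSets;

  for (int targetP = 0; targetP < targetWordsNum; targetP++) {
    sourcePosSets.push_back(cur_hypo.GetCurrTargetPhrase().GetAlignTerm().GetAlignmentsForTarget(targetP));
  }

  for (int ii = targetWordsNum - 1; ii >= 0; ii--) {
    std::set<size_t> cur_srcPosSet = sourcePosSets[ii];
    for (std::set<size_t>::const_iterator alignment = cur_srcPosSet.begin();
         alignment != cur_srcPosSet.end(); ++alignment) {
      int alignmentElement = *alignment;
      // Remove this source position from the other sets. The loop bound
      // was lost in the archive; index < ii is an assumption.
      for (int index = 0; index < ii; index++) {
        if (sourcePosSets[index].size() > 0) {
          sourcePosSets[index].erase(alignmentElement);
        }
      }
    }
  }

  for (size_t posT = 0; posT < cur_hypo.GetCurrTargetPhrase().GetSize(); ++posT) {
    const Word &word = cur_hypo.GetCurrTargetPhrase().GetWord(posT);
    if (word.IsNonTerminal()) {
      // non-terminal: fill out with the previous hypothesis
      size_t nonTermInd = cur_hypo.GetCurrTargetPhrase().GetAlignNonTerm().GetNonTermIndexMap()[posT];
      const ChartHypothesis *prevHypo = cur_hypo.GetPrevHypo(nonTermInd);
      GetSourcePhrase2(*prevHypo, srcPhrase);
    } else {
      for (std::set<size_t>::const_iterator it = sourcePosSets[posT].begin();
           it != sourcePosSets[posT].end(); it++) {
        srcPhrase.AddWord(sourcePh->GetWord(*it));
      }
    }
  }
}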


[Moses-support] compare hypothesis in moses chart

2014-12-17 Thread amir haghighi
Hi everyone

I need to implement the compare function for the Hiero model. I was wondering
which hypotheses are compared with each other: those that cover the same
source span (for example, all hypotheses that cover the span [x y]), or
those that cover source spans of the same length (spans [x x+d])?

Cheers


Re: [Moses-support] string of Words + states in feature functions

2014-12-16 Thread amir haghighi
Hoang, Hieu and Matthias, thank you all so much; your explanations really
helped me.
Hieu, could you please add your explanation of the compare function to the
adding-feature-function web page?

I'm confused about how the compare function works. Which hypotheses are
compared with each other: those that cover the same source span (for
example, all hypotheses that cover the span [x y]), or those that cover
source spans of the same length (spans [x x+d])?

Cheers






On Wed, Dec 10, 2014 at 2:02 PM, Matthias Huck  wrote:
>
> Hi Amir,
>
> The input is passed to the feature functions via
> InitializeForInput(InputType const& source).
> This method is called before search and collecting of translation
> options (cf. moses/FF/FeatureFunction.h). You can set a member variable
> to have access to the input in your scoring method.
>
> Alternatively, if you implement EvaluateWithSourceContext(), the input
> is passed directly to the method as a parameter (const InputType &input)
> and you can use that.
> Finally, there's another option in the EvaluateWhenApplied() methods.
> You can get the input from the Hypothesis object:
> const InputType& input = hypo.GetManager().GetSource();
>
> The input is an InputType object. Moses knows different input types, see
> InputTypeEnum in moses/TypeDef.h . So what you get might differ
> depending on what was passed to the decoder. If you're happy with
> implementing your feature for sentence input only, then you can cast the
> input to a Sentence object. The Sentence object gives you convenient
> access methods, in particular GetSize() and GetWord(size_t pos). You can
> thus obtain the sequence of words in the input. "Words" can contain
> several factors in Moses. The factor with index 0 is typically the
> surface form. Access it using the [] operator.
>
> I guess you will never really want to work directly with the string
> representation of the factor, but at this point you would be able to get
> it and for instance print it to your debug output.
>
> Hope this was helpful as another answer to your first question.
>
> Cheers,
> Matthias
>
>
> On Wed, 2014-12-10 at 11:41 +0330, amir haghighi wrote:
> > Hi everyone
> >
> >
> >
> > I'm implementing a feature function in moses-chart. I need the source
> > words string and also their indexes in the source sentence. I've
> > written a function that gets the source words but I don't know how
> > extract word string from a word.
> > could anyone guide me how to do that? as I know, each word is
> > implemented as an array of factors, which of them is its string?
> >
> >
> > I have also some questions about the states in the stateful features,
> > what kind of variables should be stored in each state? only those ones
> > that should be used in the compare function? or any variable from the
> > previous hypothesis  that we use in our feature?
> >
> >
> > Thanks in advance!
> >
> >
> > Cheers
> >
> > Amir
> >
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
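
The access pattern Matthias describes above can be sketched as follows. The
Factor, Word and Sentence types below are minimal stand-ins written for this
example so that it compiles on its own; they only mimic the method names
mentioned in the thread (GetSize(), GetWord(), the [] operator), and the real
Moses signatures may differ. IndexedSurfaceForms is a hypothetical helper
name.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a Moses factor: here it just wraps a string.
struct Factor {
  std::string str;
  const std::string& GetString() const { return str; }
};

// Stand-in for a Moses word: an array of factors, where factor 0 is
// typically the surface form.
struct Word {
  std::vector<Factor> factors;
  const Factor* operator[](std::size_t i) const { return &factors[i]; }
};

// Stand-in for sentence input.
struct Sentence {
  std::vector<Word> words;
  std::size_t GetSize() const { return words.size(); }
  const Word& GetWord(std::size_t pos) const { return words[pos]; }
};

// Returns (index, surface string) for every word of the input sentence.
std::vector<std::pair<std::size_t, std::string> > IndexedSurfaceForms(
    const Sentence& input) {
  std::vector<std::pair<std::size_t, std::string> > out;
  for (std::size_t pos = 0; pos < input.GetSize(); ++pos) {
    const Factor* surface = input.GetWord(pos)[0];  // factor 0 = surface form
    out.push_back(std::make_pair(pos, surface->GetString()));
  }
  return out;
}
```

In a real feature function the Sentence would come from the InputType object
passed in by the decoder, as described in the message above.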


[Moses-support] string of Words + states in feature functions

2014-12-10 Thread amir haghighi
Hi everyone

I'm implementing a feature function in moses-chart. I need the source words'
strings and also their indexes in the source sentence. I've written a
function that gets the source words, but I don't know how to extract the
string from a word.
Could anyone tell me how to do that? As far as I know, each word is
implemented as an array of factors; which of them is its string?

I also have some questions about the states in stateful features.
What kind of variables should be stored in each state: only those that
are used in the compare function, or any variable from the previous
hypothesis that we use in our feature?
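
One common way to think about the state question is that a state holds only
what later scoring steps need to know about a hypothesis, and that the
compare function decides whether two states are interchangeable. The sketch
below illustrates that idea with stand-in classes: FFState here is a
simplified mock-up, not the Moses header, and LastWordsState and its member
are hypothetical names. It assumes a feature whose future scores depend only
on the last few target words.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for a feature-function state base class.
struct FFState {
  virtual ~FFState() {}
  // Returns 0 if the two states are equivalent, non-zero otherwise.
  virtual int Compare(const FFState& other) const = 0;
};

// Hypothetical state: keeps only the last target words, because (by
// assumption) nothing else about the history affects future scores.
struct LastWordsState : public FFState {
  std::vector<std::string> lastWords;

  explicit LastWordsState(const std::vector<std::string>& words)
      : lastWords(words) {}

  int Compare(const FFState& other) const override {
    // This sketch assumes states are only compared with states of the same
    // feature, so the downcast is safe.
    const LastWordsState& o = static_cast<const LastWordsState&>(other);
    if (lastWords < o.lastWords) return -1;
    if (o.lastWords < lastWords) return 1;
    return 0;
  }
};
```

Two hypotheses whose states compare equal would then be treated as
equivalent by this feature, whatever else differs in their history. Values
that are needed only to compute the current score, and never again, would not
have to be stored.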

Thanks in advance!

Cheers
Amir


Re: [Moses-support] stateless or stateful?!

2014-10-14 Thread amir haghighi
Thanks so much Hieu for your reply.

I mean, if I call GetOutputPhrase() for hypothesis number 11 in the
following example, what will it return? Will it return "a house", or
should I follow the previous hypotheses of 11 (in this case, 7 and 9) and
call GetOutputPhrase() on each of them to obtain the target phrase?
Or maybe I should store something in the states! I don't understand what
kind of things I should store in the states.

 41 X TOP -> <s> S </s>  (1,1)             [0..5] -3.593 <<0.000, -2.606, -9.711, 2.526>> 20
 20 X S   -> NP V NP     (0,0) (1,1) (2,2) [1..4] -1.988 <<0.000, -1.737, -6.501, 2.526>> 3 5 11
  3 X NP  -> this                          [1..1]  0.486 <<0.000, -0.434, -1.330, 2.303>>
  5 X V   -> is                            [2..2] -1.267 <<0.000, -0.434, -2.533, 0.000>>
 11 X NP  -> DT NN       (0,0) (1,1)       [3..4] -2.698 <<0.000, -0.869, -5.396, 0.000>> 7 9
  7 X DT  -> a                             [3..3] -1.012 <<0.000, -0.434, -2.024, 0.000>>
  9 X NN  -> house                         [4..4] -2.887 <<0.000, -0.434, -5.774, 0.000>>
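
For illustration, the derivation in this trace can be modelled with a few
stand-in types, and the target words under a hypothesis collected by
recursing through its back-pointers. This is a sketch with made-up names
(Hypo, TargetSymbol, CollectOutput), not the Moses ChartHypothesis class;
whether GetOutputPhrase() already does this recursion internally should be
checked in the Moses source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Hypo;

// One symbol on the target side of a rule: either a terminal word or a
// non-terminal filled by a previous hypothesis.
struct TargetSymbol {
  std::string word;   // used when child is null
  const Hypo* child;  // previous hypothesis for a non-terminal, else null
};

// A hypothesis: the target side of the rule it applied.
struct Hypo {
  std::vector<TargetSymbol> target;
};

// Appends the target words covered by hypo to out, in target order.
void CollectOutput(const Hypo& hypo, std::vector<std::string>& out) {
  for (std::size_t i = 0; i < hypo.target.size(); ++i) {
    const TargetSymbol& sym = hypo.target[i];
    if (sym.child != nullptr) {
      CollectOutput(*sym.child, out);  // descend into the sub-derivation
    } else {
      out.push_back(sym.word);
    }
  }
}
```

Building hypotheses 7 ("a") and 9 ("house"), and hypothesis 11 with two
non-terminal symbols pointing at them, CollectOutput on 11 yields the words
"a" and "house" in that order.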


Another important question: how should I use my feature function when
running the chart decoder with EMS? My feature doesn't need any arguments.
I added the following lines to my config file; is that right?

   [tune]

  decoder-settings = "-threads 8 -feature-add  -weight-add "abc= 0.1""

   [EVALUATION]
   decoder-settings = "-search-algorithm 1 -cube-pruning-pop-limit
5000 -s 5000 -threads 8 -feature-add
-weight-add "abc= 0.1" "



On Mon, Oct 13, 2014 at 11:44 AM, Hieu Hoang  wrote:

> hi amir
>
> On 12 October 2014 21:36, amir haghighi 
> wrote:
>
>> Hi everyone
>>
>> I'm going to add a feature function to the Hiero model in Moses and I have
>> some questions about the Moses code. Any response would be much appreciated
>> :)
>>
>> To implement my feature function, which assigns a score to each hypothesis,
>> I need: the whole source sentence,
>> the words in the source sentence that have been translated so far, and
>> the target words that have been produced so far.
>>
>> 1. Should my feature function be stateless or stateful? Can I implement
>> it as a stateless function and extract the needed information from this vector?
>> std::vector<const ChartHypothesis*> m_prevHypos;
>>
> If you want to access m_prevHypos, then it should be a stateful feature.
> The Moses feature function framework allows stateless features to do
> this too, but that isn't really correct; it risks causing random errors.
>
>>
>> 2. which of these evaluate functions should be implemented?
>> EvaluateWhenApplied(  const ChartHypothesis& /* cur_hypo */,  int /*
>> featureID - used to index the state in the previous hypotheses */,
>> ScoreComponentCollection* accumulator) const  or
>> void SkeletonStatefulFF::EvaluateWithSourceContext(const InputType &input
>> , const InputPath &inputPath  , const TargetPhrase &targetPhrase  , const
>> StackVec *stackVec  , ScoreComponentCollection &scoreBreakdown  ,
>> ScoreComponentCollection *estimatedFutureScore) const
>>
> You should use EvaluateWhenApplied(). The hypotheses, and the variable
> m_prevHypos, are available in this function.
>
>>
>> 3. In hierarchical translation, is there any gap between produced target
>> phrases? How can I get produced target phrases in the code?
>>
> i don't understand your question. Can you give an example?
>
>>
>>
>>  Thanks in advance
>> Amir
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>


[Moses-support] stateless or stateful?!

2014-10-12 Thread amir haghighi
Hi everyone

I'm going to add a feature function to the Hiero model in Moses and I have some
questions about the Moses code. Any response would be much appreciated :)

To implement my feature function, which assigns a score to each hypothesis, I
need: the whole source sentence,
the words in the source sentence that have been translated so far, and
the target words that have been produced so far.

1. Should my feature function be stateless or stateful? Can I implement it
as a stateless function and extract the needed information from this vector?
std::vector<const ChartHypothesis*> m_prevHypos;

2. Which of these evaluate functions should be implemented?
EvaluateWhenApplied(const ChartHypothesis& /* cur_hypo */,
  int /* featureID - used to index the state in the previous hypotheses */,
  ScoreComponentCollection* accumulator) const
or
void SkeletonStatefulFF::EvaluateWithSourceContext(const InputType &input,
  const InputPath &inputPath, const TargetPhrase &targetPhrase,
  const StackVec *stackVec, ScoreComponentCollection &scoreBreakdown,
  ScoreComponentCollection *estimatedFutureScore) const

3. In hierarchical translation, is there any gap between the produced target
phrases? How can I get the produced target phrases in the code?


 Thanks in advance
Amir


Re: [Moses-support] Fwd: about the moses code

2014-07-14 Thread amir haghighi
Thank you very much, Mr Hieu, for uploading this helpful video.
Unfortunately some parts of it are corrupted (especially the first
few minutes), and what you are typing in the terminal cannot be seen.
It would be great if you could upload a better one.
Thank you again.

Amir


On Mon, Jul 14, 2014 at 2:46 AM, Hieu Hoang  wrote:

> I've just uploaded a youtube video about this
>https://www.youtube.com/watch?v=P43h827uLac&feature=youtu.be
> Hope thats useful to you
>
>
>
> On 13 July 2014 22:53, amir haghighi  wrote:
>
>>
>> Hello all
>>
>> I have been trying for a week to open the Moses code with the NetBeans or
>> Eclipse IDE, and I can't.
>> Given that the Moses code does not have any make or configure file, could
>> you please tell me how I can open and run the code with those IDEs?
>> I need to add my feature function to the Moses code, but I can't even open
>> the code with an IDE.
>>
>> I would be very grateful if you could help me.
>>
>> Thank you
>> Amir
>>
>>
>>
>>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>


[Moses-support] Fwd: about the moses code

2014-07-13 Thread amir haghighi
Hello all

I have been trying for a week to open the Moses code with the NetBeans or
Eclipse IDE, and I can't.
Given that the Moses code does not have any make or configure file, could
you please tell me how I can open and run the code with those IDEs?
I need to add my feature function to the Moses code, but I can't even open
the code with an IDE.

I would be very grateful if you could help me.

Thank you
Amir


[Moses-support] How to start coding in moses

2014-05-03 Thread amir haghighi
Hi
I want to implement a new reordering model (as a feature function) for
Moses. I need the whole sentence that Moses is currently translating, and
the previous phrases that have been translated. I am not familiar with the
Moses code and it seems very complicated to me. I have taken a look at the
code documentation, but I don't know which classes I should study. I
would be very thankful if anyone could help me.


Re: [Moses-support] tuning takes tooooo long:(

2014-02-18 Thread amir haghighi
Thank you Barry for the explanation.
The problem was solved and I finally obtained the BLEU score.

Regards
Amir
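
For anyone finding this thread later: the two settings Barry suggested
(quoted below) belong in the TUNING stanza of the EMS configuration file. A
minimal fragment, with the values copied from his message; the right thread
count depends on the machine:

```
[TUNING]
decoder-settings = "-threads 4"
filter-settings = "-MinScore 2:0.0001"
```

As discussed in the thread, multi-threaded decoding did not work with IRSTLM
("IRST LM is not threadsafe"), so this assumes the language model is loaded
through KenLM.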




On Mon, Feb 17, 2014 at 12:39 AM, Barry Haddow
wrote:

>  HI Amir
>
> You should be able to do the building of the LM with irstlm, and
> binarising and decoding with KenLM, all via EMS.
>
> Anyway, for the decoding, is Moses still running? Is it using CPU? It
> could be swapping. If it's not running, did it crash? You could try running
> again with a small stack size (the -s parameter in the decoder-settings)
>
> cheers - Barry
>
>
> On 15/02/14 09:51, amir haghighi wrote:
>
>   Hello Barry
>
>  Thank you for your help.
>
> I commented out KenLM in the configuration file and used build_binary
> to binarize the LM file. (It doesn't create an ARPA file; should I create it
> manually?)
> It didn't report any error, but it still takes too long to finish the testing
> process.
> I have 5000 sentences in my test set. I checked the test.output.1 file and
> noticed that it contains the translations of only 4690 sentences. It seems
> that it gets into a loop when translating the 4691st sentence.
>  I am afraid that I made a mistake in setting the parameters in my
> configuration file. I would be thankful if you could take a look at my file.
>
>  Thank you again
>  amir
>
>
>
>
> On Fri, Feb 14, 2014 at 5:28 AM, Barry Haddow 
> wrote:
>
>>  Hi Amir
>>
>> Even if you use IRSTLM to build the language model, you can still use
>> KenLM for decoding. Make sure you create an arpa file with IRSTLM, then use
>> build_binary to binarise it so that it loads quickly with KenLM. Then you
>> can use multi-threaded decoding,
>>
>> cheers - Barry
>>
>>
>> On 14/02/14 13:01, amir haghighi wrote:
>>
>>   Thank you Barry,
>>
>>  I use IRSTLM to build the language model. Can I use multi-threading in
>> decoder-settings?
>>  I get an "IRST LM is not threadsafe" error.
>>
>>  I want to use IRSTLM; is there any other way to speed up the tuning step?
>>
>>  Regards
>>
>>
>> On Fri, Feb 14, 2014 at 1:53 AM, Barry Haddow wrote:
>>
>>>  Hi Amir
>>>
>>> You can add
>>>
>>> decoder-settings = "-threads 4"
>>>
>>> to your TUNING stanza.
>>>
>>> Also try
>>>
>>> filter-settings = "-MinScore 2:0.0001"
>>>
>>> for more aggressive filtering.
>>>
>>> Running tuning on a laptop though is always going to be slow,
>>>
>>> cheers - Barry
>>>
>>>
>>> On 14/02/14 09:26, amir haghighi wrote:
>>>
>>>  Thank you arezki and yohit
>>>
>>>  I don't know how can I change multi-thread setting in ems config file.
>>>
>>>
>>>
>>> On Fri, Feb 14, 2014 at 12:36 AM, Arezki Sadoune wrote:
>>>
>>>>   Hello Amir
>>>> I think your tuning process will go faster if you use multi-threaded
>>>> MERT.
>>>> /home/mert-moses.pl --threads 4
>>>>  Of course, you should specify 8 instead of 4 if your laptop is
>>>> equipped with eight cores.
>>>>  Best regards
>>>>
>>>>
>>>>   Le Vendredi 14 février 2014 8h27, amir haghighi <
>>>> amir.haghighi...@gmail.com> a écrit :
>>>>   Hello
>>>>
>>>>  I have a corpus with 400'000 sentences for training, 1000 sentences
>>>> for tuning and 100'000 sentences for testing. With my old laptop, I
>>>> couldn't finish running EMS on my corpus even after 3 days.
>>>> I have bought a new laptop (Core i7, CPU 2.40, 8 GB RAM) but I still
>>>> can't run EMS! It has been in the tuning step for 3 days and has not
>>>> finished yet.
>>>>  Is it possible that it is stuck in an endless loop?
>>>>  How can I check on its progress?
>>>>
>>>>  regards
>>>>  Amir
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>
>
>


[Moses-support] getting zero bleu score on the test set

2014-02-15 Thread amir haghighi
Hello
My EMS system is trained on 400'000 sentences. The phrase table and
reordering model are filled correctly, but I get (test: 0.00 (1.100) BLEU)
when running EMS on my test set, which has 500 sentences.
I don't know why this happens.

Thank you for any help


Re: [Moses-support] tuning takes tooooo long:(

2014-02-15 Thread amir haghighi
Hello Barry

Thank you for your help.

I commented out KenLM in the configuration file and used build_binary to
binarize the LM file. (It doesn't create an ARPA file; should I create it
manually?)
It didn't report any error, but it still takes too long to finish the testing
process.
I have 5000 sentences in my test set. I checked the test.output.1 file and
noticed that it contains the translations of only 4690 sentences. It seems
that it gets into a loop when translating the 4691st sentence.
I am afraid that I made a mistake in setting the parameters in my
configuration file. I would be thankful if you could take a look at my file.

Thank you again
amir




On Fri, Feb 14, 2014 at 5:28 AM, Barry Haddow wrote:

>  Hi Amir
>
> Even if you use IRSTLM to build the language model, you can still use
> KenLM for decoding. Make sure you create an arpa file with IRSTLM, then use
> build_binary to binarise it so that it loads quickly with KenLM. Then you
> can use multi-threaded decoding,
>
> cheers - Barry
>
>
> On 14/02/14 13:01, amir haghighi wrote:
>
>   Thank you Barry,
>
>  I use IRSTLM to build the language model. Can I use multi-threading in
> decoder-settings?
>  I get an "IRST LM is not threadsafe" error.
>
>  I want to use IRSTLM; is there any other way to speed up the tuning step?
>
>  Regards
>
>
> On Fri, Feb 14, 2014 at 1:53 AM, Barry Haddow 
> wrote:
>
>>  Hi Amir
>>
>> You can add
>>
>> decoder-settings = "-threads 4"
>>
>> to your TUNING stanza.
>>
>> Also try
>>
>> filter-settings = "-MinScore 2:0.0001"
>>
>> for more aggressive filtering.
>>
>> Running tuning on a laptop though is always going to be slow,
>>
>> cheers - Barry
>>
>>
>> On 14/02/14 09:26, amir haghighi wrote:
>>
>>  Thank you arezki and yohit
>>
>>  I don't know how can I change multi-thread setting in ems config file.
>>
>>
>>
>> On Fri, Feb 14, 2014 at 12:36 AM, Arezki Sadoune 
>> wrote:
>>
>>>   Hello Amir
>>> I think your tuning process will go faster if you use a multi-threaded
>>> Mert.
>>> /home/mert-moses.pl --threads 4
>>>  you have of course tu indicate 8 instead of 4 if your laptop is
>>> equipped with eight cores
>>>  Best regards
>>>
>>>
>>>   Le Vendredi 14 février 2014 8h27, amir haghighi <
>>> amir.haghighi...@gmail.com> a écrit :
>>>   Hello
>>>
>>>  I have a corpus with 400'000 sentences for training, 1000 sentences for
>>> tuning and 100'000 sentences for test. I couldn't run ems on my corpus,
>>> after 3 days, with my old laptop.
>>> I have bought a new laptop (core i7, cpu 2.40 , 8G Ram) but I can't
>>> still run ems! it is 3 days that it is in the tuning step and it is not
>>> finished yet.
>>>  Is it possible that it gets in an endless loop?
>>>  How can I check it's process?
>>>
>>>  regards
>>>  Amir
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>
>
>

### CONFIGURATION FILE FOR AN SMT EXPERIMENT ###


[GENERAL]

### directory in which experiment is run
#
working-dir = /opt/working/ems3

# specification of the language pair
input-extension = En
output-extension = Fa
pair-extension = En-Fa

### directories that contain tools and data
# 
# moses
moses-src-dir = /opt/tools/mosesdecoder
#
# moses binaries
moses-bin-dir = $moses-src-dir/bin
#
# moses scripts
moses-script-dir = $moses-src-dir/scripts
#
# directory where GIZA++/MGIZA programs resides
external-bin-dir = $moses-src-dir/tools
#
# srilm
#srilm-dir = $moses-src-dir/srilm/bin/i686
#
# irstlm
irstlm-dir = /opt/tools/irstlm/bin
#
# randlm
#randlm-dir = $moses-src-dir/randlm/bin
#
# data
toy-data = /opt/dataset/mizan

### basic tools
#
# moses decoder
decoder = $moses-bin-dir/moses

# conversion of phrase table into binary on-disk format
ttable-binarizer = $moses-bin-dir/processPhraseTable

# conversion of rule table into 

Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread amir haghighi
Thank you Barry,

I use IRSTLM to build the language model. Can I use multi-threading in
decoder-settings?
I get an "IRST LM is not threadsafe" error.

I want to keep using IRSTLM. Is there any other way to speed up the tuning step?

Regards


On Fri, Feb 14, 2014 at 1:53 AM, Barry Haddow wrote:

>  Hi Amir
>
> You can add
>
> decoder-settings = "-threads 4"
>
> to your TUNING stanza.
>
> Also try
>
> filter-settings = "-MinScore 2:0.0001"
>
> for more aggressive filtering.
>
> Running tuning on a laptop though is always going to be slow,
>
> cheers - Barry
>
>
> On 14/02/14 09:26, amir haghighi wrote:
>
>  Thank you Arezki and Yohit,
>
>  I don't know how to change the multi-threading setting in the EMS config file.
>
>
>
> On Fri, Feb 14, 2014 at 12:36 AM, Arezki Sadoune 
> wrote:
>
>>   Hello Amir
>> I think your tuning process will go faster if you use multi-threaded
>> MERT:
>> /home/mert-moses.pl --threads 4
>>  Of course, you should specify 8 instead of 4 if your laptop has
>> eight cores.
>>  Best regards
>>
>>
>>   On Friday, 14 February 2014 at 8:27, amir haghighi <
>> amir.haghighi...@gmail.com> wrote:
>>   Hello
>>
>>  I have a corpus with 400'000 sentences for training, 1000 sentences for
>> tuning and 100'000 sentences for test. I couldn't finish an EMS run on my
>> corpus, even after 3 days, with my old laptop.
>> I have bought a new laptop (Core i7, CPU 2.40, 8 GB RAM) but I still can't
>> finish an EMS run! It has been in the tuning step for 3 days and has not
>> finished yet.
>>  Is it possible that it is stuck in an endless loop?
>>  How can I check its progress?
>>
>>  regards
>>  Amir
>>
>>
>>
>>
>
>
>
>
>


Re: [Moses-support] tuning takes tooooo long:(

2014-02-14 Thread amir haghighi
Thank you Arezki and Yohit,

I don't know how to change the multi-threading setting in the EMS config file.
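
For reference, the advice given in this thread is to add
decoder-settings = "-threads 4" (and optionally
filter-settings = "-MinScore 2:0.0001") to the TUNING stanza of the EMS
config. A minimal Python sketch of making that edit is below. The function
name add_tuning_settings and the sample config are made up for illustration,
and it assumes the stanza header is written exactly as [TUNING] and does not
already define these keys. Note that multi-threaded decoding is reported in
this thread to fail with IRSTLM ("IRST LM is not threadsafe").

```python
def add_tuning_settings(config_text, settings):
    """Insert 'key = "value"' lines right after the [TUNING] header.

    Assumption: the keys are not already present in the stanza; this
    sketch does not look for or replace existing definitions.
    """
    out = []
    for line in config_text.splitlines():
        out.append(line)
        if line.strip() == "[TUNING]":
            for key, value in settings.items():
                out.append(f'{key} = "{value}"')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical, heavily shortened config used only for the demonstration.
    sample = "[GENERAL]\ncores = 8\n\n[TUNING]\nnbest = 100\n"
    print(add_tuning_settings(sample, {
        "decoder-settings": "-threads 4",
        "filter-settings": "-MinScore 2:0.0001",
    }))
```

Editing the config by hand works just as well; the sketch only shows where
the two lines are meant to go.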



On Fri, Feb 14, 2014 at 12:36 AM, Arezki Sadoune wrote:

> Hello Amir
> I think your tuning process will go faster if you use multi-threaded
> MERT:
> /home/mert-moses.pl --threads 4
>  Of course, you should specify 8 instead of 4 if your laptop has
> eight cores.
> Best regards
>
>
>   On Friday, 14 February 2014 at 8:27, amir haghighi <
> amir.haghighi...@gmail.com> wrote:
>  Hello
>
> I have a corpus with 400'000 sentences for training, 1000 sentences for
> tuning and 100'000 sentences for test. I couldn't finish an EMS run on my
> corpus, even after 3 days, with my old laptop.
> I have bought a new laptop (Core i7, CPU 2.40, 8 GB RAM) but I still can't
> finish an EMS run! It has been in the tuning step for 3 days and has not
> finished yet.
> Is it possible that it is stuck in an endless loop?
> How can I check its progress?
>
> regards
> Amir
>
>
>
>


[Moses-support] tuning takes tooooo long:(

2014-02-13 Thread amir haghighi
Hello

I have a corpus with 400'000 sentences for training, 1000 sentences for
tuning and 100'000 sentences for test. I couldn't finish an EMS run on my
corpus, even after 3 days, with my old laptop.
I have bought a new laptop (Core i7, CPU 2.40, 8 GB RAM) but I still can't
finish an EMS run! It has been in the tuning step for 3 days and has not
finished yet.
Is it possible that it is stuck in an endless loop?
How can I check its progress?

regards
Amir
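
One rough way to tell a slow tuning run from a stuck one is to look at the
tuning working directory. The Python sketch below assumes (this is not
confirmed in the thread) that mert-moses.pl writes one run<N>.moses.ini file
per finished iteration into that directory; the function name mert_progress
is made up for illustration.

```python
import glob
import os
import re
import time


def mert_progress(tuning_dir):
    """Return (finished_iterations, seconds_since_last_write).

    Assumption: each finished iteration leaves a run<N>.moses.ini file
    in tuning_dir. The second value is None for an empty directory.
    """
    iterations = []
    for path in glob.glob(os.path.join(tuning_dir, "run*.moses.ini")):
        match = re.match(r"run(\d+)\.moses\.ini$", os.path.basename(path))
        if match:
            iterations.append(int(match.group(1)))
    files = [os.path.join(tuning_dir, name) for name in os.listdir(tuning_dir)]
    newest = max((os.path.getmtime(p) for p in files), default=None)
    idle = None if newest is None else time.time() - newest
    return (max(iterations, default=0), idle)
```

It could be called as mert_progress("/opt/working/ems3/tuning/tmp.1"), where
the path is only a guess based on the directories mentioned in this thread.
An iteration count that keeps growing, or files that were written recently,
suggest the run is slow rather than looping.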


[Moses-support] exception during tuning step

2014-02-09 Thread amir haghighi
Hello all



When I run the Moses EMS, the tuning step throws this exception:

Exception: moses/FF/Factory.cpp:235 in void
Moses::FeatureRegistry::Construct(const string&, const string&) threw
UnknownFeatureException because `i == registry_.end()'.
Feature name IRSTLM is not registered.
Exit code: 1
Failed to run moses with the config
/opt/working/ems/tuning/moses.filtered.ini.1 at
/opt/tools/mosesdecoder/scripts/training/mert-moses.pl line 1271.
cp: cannot stat '/opt/working/ems/tuning/tmp.1/moses.ini': No such file or
directory


I would be thankful if you could help me solve this problem.

Regards

amir
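
The exception says that the decoder binary does not know a feature called
IRSTLM. One possible cause (an assumption, not something confirmed in this
thread) is a Moses build without IRSTLM support. As a first check, the
Python sketch below lists the feature functions a moses.ini asks for, so
they can be compared with what the binary was built with. It assumes the
features are listed one per line in a [feature] section; the function name
requested_features and the sample text are made up for illustration.

```python
def requested_features(ini_text):
    """Return the feature-function names listed in the [feature] section."""
    names = []
    in_feature_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            in_feature_section = (line == "[feature]")
            continue
        if in_feature_section:
            # The first word on the line is the feature name.
            names.append(line.split()[0])
    return names


if __name__ == "__main__":
    # Hypothetical excerpt, not the config from this thread.
    sample = (
        "[feature]\n"
        "UnknownWordPenalty\n"
        "WordPenalty\n"
        "IRSTLM name=LM0 factor=0 path=/tmp/lm order=3\n"
        "\n"
        "[weight]\n"
        "LM0= 0.5\n"
    )
    print(requested_features(sample))
```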


Re: [Moses-support] word alignment-words' indexes and sentences' length

2014-01-24 Thread amir haghighi
Thank you Barry for your help.

Hi Amin,
I can't see the link. Could you please attach it to your email?

Regards
Amir


Re: [Moses-support] word alignment-words' indexes and sentences' length

2014-01-24 Thread amir haghighi
I use the tokenizer built into Moses.
How can I change this tokenizer? Should I change the source code?

Regards
Amir


Re: [Moses-support] word alignment-words' indexes and sentences' length

2014-01-23 Thread amir haghighi
I removed all of the double spaces from the corpus, but there are still some
double spaces in the tokenised file.
My source language is Persian and I have half-spaces in my corpus. I
noticed that after the tokenisation step, these half-spaces are converted to
double spaces. This conversion distorts the sentence lengths and the
alignment.
How can I prevent this conversion?

Thank you again
Amir
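
To see what the tokenised line really contains, it can help to print every
token, including empty and invisible ones. The Python sketch below assumes
that the half-space is the zero-width non-joiner (U+200C); if it ends up as
a token of its own, it would look like a double space in an editor. That is
a guess, not something confirmed in this thread. The function name
describe_tokens is made up for illustration.

```python
import unicodedata

ZWNJ = "\u200c"  # zero-width non-joiner, assumed here to be the "half-space"


def describe_tokens(line):
    """Split on single spaces and label empty or invisible tokens."""
    report = []
    for token in line.rstrip("\n").split(" "):
        if token == "":
            report.append("<empty>")
        elif all(unicodedata.category(ch) == "Cf" for ch in token):
            # Format characters such as U+200C take up no visible width.
            codes = "+".join(f"U+{ord(ch):04X}" for ch in token)
            report.append(f"<invisible:{codes}>")
        else:
            report.append(token)
    return report


if __name__ == "__main__":
    # "mi" and "ravam" stand in for the two halves of a Persian word.
    print(describe_tokens("mi " + ZWNJ + " ravam"))
    print(describe_tokens("mi  ravam"))
```

If the report shows <invisible:U+200C> tokens, the extra "words" come from
the half-space characters rather than from real double spaces.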


On Wed, Jan 22, 2014 at 2:10 PM, Hieu Hoang  wrote:

> yes, remove the double space. Sometimes, the double space is ignored,
> sometimes it's counted as a 'word' with no characters, depending on exactly
> how the program tokenizes the line.
>
>
>
>
> On 22 January 2014 10:09, amir haghighi wrote:
>
>> Thank you Hieu,
>>
>> The corpus is UTF-8, but there is a double space in this line. Are double
>> spaces regarded as a word?
>> Should I remove double spaces from the lines manually to get the correct
>> sentence length?
>>
>>
>>
>> On Tue, Jan 21, 2014 at 4:12 AM, Hieu Hoang  wrote:
>>
>>>
>>> On 20/01/2014 13:45, amir haghighi wrote:
>>>
>>>   Hello
>>>
>>>  I have some questions about the GIZA word alignment.
>>>
>>>  1- Where is the final alignment file? Is it the aligned.1.grow in
>>> the model folder?
>>>
>>> yes.
>>>
>>>
>>>  2- Do the indexes of the words of both target and source sentences start
>>> from 0?
>>>
>>> yes
>>>
>>>
>>>  3- How does GIZA calculate the length of a sentence?
>>>
>>> the number of words
>>>
>>>  I have a sentence with 11 tokens that are separated by spaces, but in
>>> the alignment file its length is 13.
>>>
>>> strange. Are you sure your corpus file is encoded as UTF8? Are there
>>> double spaces in the line?
>>>
>>>
>>>  Regards
>>>  Amir
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>


Re: [Moses-support] word alignment-words' indexes and sentences' length

2014-01-22 Thread amir haghighi
Thank you Hieu,

The corpus is UTF-8, but there is a double space in this line. Are double
spaces regarded as a word?
Should I remove double spaces from the lines manually to get the correct
sentence length?
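
The effect of a double space on the token count depends on how the line is
split, and it can be removed automatically rather than by hand. A small
Python sketch follows; the function names count_tokens and squeeze_spaces
are made up for illustration.

```python
def count_tokens(line):
    """Return (count splitting on every space, count splitting on whitespace runs)."""
    text = line.rstrip("\n")
    return (len(text.split(" ")), len(text.split()))


def squeeze_spaces(line):
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return " ".join(line.split())


if __name__ == "__main__":
    line = "a  b c"                      # double space between "a" and "b"
    print(count_tokens(line))            # the two counts differ
    print(count_tokens(squeeze_spaces(line)))  # the two counts agree
```

Splitting on every single space turns the gap between two adjacent spaces
into an empty "word", which is one way a sentence can appear longer than
expected.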



On Tue, Jan 21, 2014 at 4:12 AM, Hieu Hoang  wrote:

>
> On 20/01/2014 13:45, amir haghighi wrote:
>
>   Hello
>
>  I have some questions about the GIZA word alignment.
>
>  1- Where is the final alignment file? Is it the aligned.1.grow in the
> model folder?
>
> yes.
>
>
>  2- Do the indexes of the words of both target and source sentences start
> from 0?
>
> yes
>
>
>  3- How does GIZA calculate the length of a sentence?
>
> the number of words
>
>  I have a sentence with 11 tokens that are separated by spaces, but in
> the alignment file its length is 13.
>
> strange. Are you sure your corpus file is encoded as UTF8? Are there
> double spaces in the line?
>
>
>  Regards
>  Amir
>
>
>
>
>
>


[Moses-support] word alignment-words' indexes and sentences' length

2014-01-20 Thread amir haghighi
Hello

I have some questions about the GIZA word alignment.

1- Where is the final alignment file? Is it the aligned.1.grow in the
model folder?

2- Do the indexes of the words of both target and source sentences start
from 0?

3- How does GIZA calculate the length of a sentence? I have a sentence with
11 tokens that are separated by spaces, but in the alignment file its
length is 13.

Regards
Amir
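
According to the replies in this thread, word indexes start from 0 on both
sides and the sentence length is the number of words. The Python sketch
below checks one alignment line against those two facts. It assumes each
alignment point is written as "i-j" (source index, then target index),
which is not stated in the thread; the function names are made up for
illustration.

```python
def parse_alignment(line):
    """Parse 'i-j' items into a list of (source_index, target_index) tuples."""
    pairs = []
    for item in line.split():
        src, tgt = item.split("-")
        pairs.append((int(src), int(tgt)))
    return pairs


def out_of_range(pairs, source_tokens, target_tokens):
    """Return the pairs whose 0-based indexes fall outside either sentence."""
    return [
        (s, t) for s, t in pairs
        if not (0 <= s < len(source_tokens) and 0 <= t < len(target_tokens))
    ]


if __name__ == "__main__":
    source = "the house is small".split()
    target = "das Haus ist klein".split()
    points = parse_alignment("0-0 1-1 2-2 3-3")
    print(out_of_range(points, source, target))  # no pair is out of range
```

Any pair reported here points past the end of a sentence, which usually
means the token count used for the check differs from the one the aligner
saw.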


[Moses-support] how long should EVALUATION:test:decode step last?

2014-01-11 Thread amir haghighi
Hi,
I am running EMS to get the BLEU score for 5 test sentences. It has been in
the EVALUATION:test:decode step for three days. Is this normal? How long
should this step take?

During this step, I also get the following message:
Use of uninitialized value $post_decoding_transliteration in string eq at
/opt/tools/mosesdecoder/scripts/ems/experiment.perl line 2655.
Is this an error?


Re: [Moses-support] error during testing

2013-12-20 Thread amir haghighi
Thank you Mr Hieu.

I installed Ubuntu on VMware.

Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise

gcc version:
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

Moses code:
the latest version

Which files do you need to replicate the problem? Only the model files that
Moses creates? My dataset? Or any other file?
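
The details asked for in this thread (OS, gcc version, exact Moses version)
can be collected in one go. The Python sketch below is one way to do it. It
assumes Moses was fetched with git so that a commit hash is available,
which is more precise than "the latest version"; the function names are
made up for illustration.

```python
import platform
import subprocess


def first_line(command, cwd=None):
    """Return the first output line of a command, or None if it cannot run or fails."""
    try:
        result = subprocess.run(
            command, cwd=cwd, capture_output=True, text=True, check=False
        )
    except OSError:
        return None  # command or directory not found
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def environment_report(moses_dir):
    """Collect OS, gcc version and the Moses git revision (None if unavailable)."""
    return {
        "platform": platform.platform(),
        "gcc": first_line(["gcc", "--version"]),
        "moses_commit": first_line(["git", "rev-parse", "HEAD"], cwd=moses_dir),
    }


if __name__ == "__main__":
    # /opt/tools/mosesdecoder is the checkout path from an EMS config in this thread.
    for key, value in environment_report("/opt/tools/mosesdecoder").items():
        print(f"{key}: {value}")
```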


Regards



Message: 2
Date: Fri, 20 Dec 2013 15:01:00 +
From: Hieu Hoang 
Subject: Re: [Moses-support] error during testing
To: moses-support@mit.edu
Message-ID: <52b45bac.1000...@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

there shouldn't be a problem with compiling moses on current gcc.

What is the gcc version? What OS, distribution, exact version are you
running? Can you please make your models available for download via
Dropbox or Google Drive. I will try to replicate and fix the problem.


Re: [Moses-support] error during testing

2013-12-19 Thread amir haghighi
 Dear Hieu

I got the latest code but the problem is not solved yet. It still gives the
segmentation fault error ~x(.
Is there any other way to solve this problem except downgrading gcc?

Regards
Amir





On Wed, Dec 18, 2013 at 8:22 PM, Hieu Hoang  wrote:

> ah, that's good. However, if you get the latest Moses code by doing
>    git pull
> you shouldn't get the problem. It was fixed on the 15th November
>
> https://github.com/moses-smt/mosesdecoder/commit/17887a27969e83f4100bd0f4af98986e33999fbe
>
>
>
>


Re: [Moses-support] error during compiling moses

2013-12-19 Thread amir haghighi
Dear Kenneth Heafield

Thank you for your help. I installed libbz2-dev and the problem was solved.


Re: [Moses-support] error during testing

2013-12-19 Thread amir haghighi
Dear Mr Nadeem

Thank you very much for your help.
Over the past month, I tried everything to solve this problem but
nothing worked.
After your email, I remembered that I had upgraded my gcc, and that is
probably the main cause of the problem.

I will get the updated Moses code and hope the problem will be solved
(thanks to Dr Hieu).

Regards
Amir




On Wed, Dec 18, 2013 at 8:22 PM, Hieu Hoang  wrote:

> ah, that's good. However, if you get the latest Moses code by doing
>    git pull
> you shouldn't get the problem. It was fixed on the 15th November
>
> https://github.com/moses-smt/mosesdecoder/commit/17887a27969e83f4100bd0f4af98986e33999fbe
>
>
> On 18 December 2013 16:40, nadeem khan  wrote:
>
>> I tried everything. I was working on Ubuntu 13.10, which ships the
>> latest gcc, and I was getting the segfault during testing. Then I
>> downgraded gcc, g++ etc. and Boost to 1.48, and everything works fine...
>>
>> Regards
>> Nadeem
>>
>>
>>   On Wednesday, December 18, 2013 9:33 PM, Hieu Hoang <
>> hieu.ho...@ed.ac.uk> wrote:
>>  there shouldn't be a problem with gcc version. There was a problem but
>> it has been fixed
>>
>> http://article.gmane.org/gmane.comp.nlp.moses.user/9868/match=gcc+4.8.2
>> However, it doesn't create an empty data set.
>>
>> Can you please tell me exactly what is the command that produces the
>> segfault. And can you please make available for download (via dropbox or
>> google drive) the files needed for me to reproduce the error.
>>
>>
>>
>>
>> On 18 December 2013 16:08, nadeem khan  wrote:
>>
>>  Hi Amir,
>>
>>
>> It is a problem with your gcc version; try an older gcc and it works fine.
>>
>>
>>On Wednesday, December 18, 2013 7:56 PM, amir haghighi <
>> amir.haghighi...@gmail.com> wrote:
>>Hello,
>>
>> My problem is not solved yet :(
>>
>> I changed the test data several times, but every time I got the
>> "segmentation fault" error! The reordering table of the training data set
>> is not empty, but for all of the test data sets it is empty.
>> Could anybody help me?
>>
>> Regards
>> Amir
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
>>
>>
>>
>>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>


Re: [Moses-support] error during testing

2013-12-18 Thread amir haghighi
  Hello,

My problem is not solved yet :(

I changed the test data several times, but every time I got the
"segmentation fault" error! The reordering table of the training data set
is not empty, but for all of the test data sets it is empty.
Could anybody help me?

Regards
Amir



amir haghighi  wrote:

The file model/reordering-table.* is not empty, but the file
evaluation/*.filtered.*/reordering-table.1.* is!
My test set is not empty.

Thank you for your answers.


On Sun, Dec 8, 2013 at 3:29 PM, Hieu Hoang <
hieu.hoang-5whefg1t...@public.gmane.org> wrote:

>   everything looks ok, I'm not sure why it's segfaulting
>
> is the file
>   model/reordering-table.*
> empty? If it is, then you should look in the log file
>   steps/*/TRAINING_build-reordering.*.STDERR
>
> or is
>   evaluation/*.filtered.*/reordering-table.1.*
> empty? is your test set empty?
>
>
>
> On 8 December 2013 09:47, amir haghighi wrote:
>
>>  Yes, the parallel data is UTF-8 (one side is UTF-8 and the other is ASCII).
>> All of the preprocessing steps are done with Moses scripts.
>>
>>  Here is the EMS config file content:
>>
>> 
>> ### CONFIGURATION FILE FOR AN SMT EXPERIMENT ###
>> 
>>
>> [GENERAL]
>>
>> ### directory in which experiment is run
>> #
>> working-dir = /opt/tools/workingEms
>>
>> # specification of the language pair
>> input-extension = En
>> output-extension = Fa
>> pair-extension = En-Fa
>>
>> ### directories that contain tools and data
>> #
>> # moses
>> moses-src-dir =
>> /opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0
>> #
>> # moses binaries
>> moses-bin-dir = $moses-src-dir/bin
>> #
>> # moses scripts
>> moses-script-dir = $moses-src-dir/scripts
>> #
>> # directory where GIZA++/MGIZA programs resides
>> external-bin-dir = $moses-src-dir/tools
>> #
>> # srilm
>> #srilm-dir = $moses-src-dir/srilm/bin/i686
>> #
>> # irstlm
>> irstlm-dir = /opt/tools/irstlm/bin
>> #
>> # randlm
>> #randlm-dir = $moses-src-dir/randlm/bin
>> #
>> # data
>> toy-data = /opt/tools/dataset/mizan
>>
>> ### basic tools
>> #
>> # moses decoder
>> decoder = $moses-bin-dir/moses
>>
>> # conversion of phrase table into binary on-disk format
>> ttable-binarizer = $moses-bin-dir/processPhraseTable
>>
>> # conversion of rule table into binary on-disk format
>> #ttable-binarizer = "$moses-bin-dir/CreateOnDiskPt 1 1 5 100 2"
>>
>> # tokenizers - comment out if all your data is already tokenized
>> input-tokenizer = "$moses-script-dir/tokenizer/tokenizer.perl -a -l
>> $input-extension"
>> output-tokenizer = "$moses-script-dir/tokenizer/tokenizer.perl -a -l
>> $output-extension"
>>
>> # truecasers - comment out if you do not use the truecaser
>> input-truecaser = $moses-script-dir/recaser/truecase.perl
>> output-truecaser = $moses-script-dir/recaser/truecase.perl
>> detruecaser = $moses-script-dir/recaser/detruecase.perl
>>
>> ### generic parallelizer for cluster and multi-core machines
>> # you may specify a script that allows the parallel execution
>> # parallizable steps (see meta file). you also need specify
>> # the number of jobs (cluster) or cores (multicore)
>> #
>> #generic-parallelizer =
>> $moses-script-dir/ems/support/generic-parallelizer.perl
>> #generic-parallelizer =
>> $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>>
>> ### cluster settings (if run on a cluster machine)
>> # number of jobs to be submitted in parallel
>> #
>> #jobs = 10
>>
>> # arguments to qsub when scheduling a job
>> #qsub-settings = ""
>>
>> # project for priviledges and usage accounting
>> #qsub-project = iccs_smt
>>
>> # memory and time
>> #qsub-memory = 4
>> #qsub-hours = 48
>>
>> ### multi-core settings
>> # when the generic parallelizer is used, the number of cores
>> # specified here
>> cores = 8
>>
>> #
>> # PARALLEL CORPUS PREPARATION:
>> # create a tokenized, sentence-aligned corpus, ready for training
>>
>> [CORPUS]
>>
>> ### long sentences are filtered out, since they slow down GIZA++
>> # and are a less reliable source of data. set here the maximum
>> # length of a sentence
>>

Re: [Moses-support] error during testing

2013-12-08 Thread amir haghighi
The file model/reordering-table.* is not empty, but the file
evaluation/*.filtered.*/reordering-table.1.* is!
My test set is not empty.

Thank you for your answers.
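
The checks suggested in this thread (is model/reordering-table.* empty, is
the filtered table under evaluation/*.filtered.*/ empty) can be scripted. A
Python sketch follows; it treats a .gz file as empty when it decompresses
to nothing, and it also looks at the filtered phrase table, which is an
addition to what was asked. The function names are made up for
illustration.

```python
import glob
import gzip
import os


def is_empty(path):
    """True if the file has no content; .gz files are checked after decompression."""
    if path.endswith(".gz"):
        with gzip.open(path, "rb") as handle:
            return handle.read(1) == b""
    return os.path.getsize(path) == 0


def empty_tables(working_dir):
    """Map each model or filtered table file under working_dir to True if it is empty."""
    patterns = [
        "model/reordering-table.*",
        "evaluation/*.filtered.*/reordering-table.*",
        "evaluation/*.filtered.*/phrase-table.*",
    ]
    found = {}
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(working_dir, pattern))):
            if os.path.isfile(path):
                found[path] = is_empty(path)
    return found
```

For example, empty_tables("/opt/tools/workingEms") would cover the working
directory named in the EMS config quoted in this thread.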


On Sun, Dec 8, 2013 at 3:29 PM, Hieu Hoang  wrote:

> everything looks ok, I'm not sure why it's segfaulting
>
> is the file
>   model/reordering-table.*
> empty? If it is, then you should look in the log file
>   steps/*/TRAINING_build-reordering.*.STDERR
>
> or is
>   evaluation/*.filtered.*/reordering-table.1.*
> empty? is your test set empty?
>
>
>
> On 8 December 2013 09:47, amir haghighi wrote:
>
>> Yes, the parallel data is UTF-8 (one side is UTF-8 and the other is ASCII).
>> All of the preprocessing steps are done with Moses scripts.
>>
>> Here is the EMS config file content:
>>
>> 
>> ### CONFIGURATION FILE FOR AN SMT EXPERIMENT ###
>> 
>>
>> [GENERAL]
>>
>> ### directory in which experiment is run
>> #
>> working-dir = /opt/tools/workingEms
>>
>> # specification of the language pair
>> input-extension = En
>> output-extension = Fa
>> pair-extension = En-Fa
>>
>> ### directories that contain tools and data
>> #
>> # moses
>> moses-src-dir =
>> /opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0
>> #
>> # moses binaries
>> moses-bin-dir = $moses-src-dir/bin
>> #
>> # moses scripts
>> moses-script-dir = $moses-src-dir/scripts
>> #
>> # directory where GIZA++/MGIZA programs resides
>> external-bin-dir = $moses-src-dir/tools
>> #
>> # srilm
>> #srilm-dir = $moses-src-dir/srilm/bin/i686
>> #
>> # irstlm
>> irstlm-dir = /opt/tools/irstlm/bin
>> #
>> # randlm
>> #randlm-dir = $moses-src-dir/randlm/bin
>> #
>> # data
>> toy-data = /opt/tools/dataset/mizan
>>
>> ### basic tools
>> #
>> # moses decoder
>> decoder = $moses-bin-dir/moses
>>
>> # conversion of phrase table into binary on-disk format
>> ttable-binarizer = $moses-bin-dir/processPhraseTable
>>
>> # conversion of rule table into binary on-disk format
>> #ttable-binarizer = "$moses-bin-dir/CreateOnDiskPt 1 1 5 100 2"
>>
>> # tokenizers - comment out if all your data is already tokenized
>> input-tokenizer = "$moses-script-dir/tokenizer/tokenizer.perl -a -l
>> $input-extension"
>> output-tokenizer = "$moses-script-dir/tokenizer/tokenizer.perl -a -l
>> $output-extension"
>>
>> # truecasers - comment out if you do not use the truecaser
>> input-truecaser = $moses-script-dir/recaser/truecase.perl
>> output-truecaser = $moses-script-dir/recaser/truecase.perl
>> detruecaser = $moses-script-dir/recaser/detruecase.perl
>>
>> ### generic parallelizer for cluster and multi-core machines
>> # you may specify a script that allows the parallel execution
>> # parallizable steps (see meta file). you also need specify
>> # the number of jobs (cluster) or cores (multicore)
>> #
>> #generic-parallelizer =
>> $moses-script-dir/ems/support/generic-parallelizer.perl
>> #generic-parallelizer =
>> $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>>
>> ### cluster settings (if run on a cluster machine)
>> # number of jobs to be submitted in parallel
>> #
>> #jobs = 10
>>
>> # arguments to qsub when scheduling a job
>> #qsub-settings = ""
>>
>> # project for priviledges and usage accounting
>> #qsub-project = iccs_smt
>>
>> # memory and time
>> #qsub-memory = 4
>> #qsub-hours = 48
>>
>> ### multi-core settings
>> # when the generic parallelizer is used, the number of cores
>> # specified here
>> cores = 8
>>
>> #
>> # PARALLEL CORPUS PREPARATION:
>> # create a tokenized, sentence-aligned corpus, ready for training
>>
>> [CORPUS]
>>
>> ### long sentences are filtered out, since they slow down GIZA++
>> # and are a less reliable source of data. set here the maximum
>> # length of a sentence
>> #
>> max-sentence-length = 80
>>
>> [CORPUS:toy]
>>
>> ### command to run to get raw corpus files
>> #
>> # get-corpus-script =
>>
>> ### raw corpus files (untokenized, but sentence aligned)
>> #
>> raw-stem = $toy-data/M_Tr
>>
>> ### tokenized corpus files (may contain long sentences)
>> #
>> #tokenized-stem =
>

Re: [Moses-support] error during testing

2013-12-08 Thread amir haghighi
enized-stem =

### trained model
#
# truecase-model =

##
## EVALUATION: translating a test set using the tuned system and score it

[EVALUATION]

### additional flags for the filter script
#
#filter-settings = ""

### additional decoder settings
# switches for the Moses decoder
# common choices:
#   "-threads N" for multi-threading
#   "-mbr" for MBR decoding
#   "-drop-unknown" for dropping unknown source words
#   "-search-algorithm 1 -cube-pruning-pop-limit 5000 -s 5000" for cube
pruning
#
decoder-settings = "-search-algorithm 1 -cube-pruning-pop-limit 5000 -s
5000"

### specify size of n-best list, if produced
#
#nbest = 100

### multiple reference translations
#
#multiref = yes

### prepare system output for scoring
# this may include detokenization and wrapping output in sgm
# (needed for nist-bleu, ter, meteor)
#
detokenizer = "$moses-script-dir/tokenizer/detokenizer.perl -l
$output-extension"
#recaser = $moses-script-dir/recaser/recase.perl
wrapping-script = "$moses-script-dir/ems/support/wrap-xml.perl
$output-extension"
#output-sgm =

### BLEU
#
nist-bleu = $moses-script-dir/generic/mteval-v13a.pl
nist-bleu-c = "$moses-script-dir/generic/mteval-v13a.pl -c"
#multi-bleu = $moses-script-dir/generic/multi-bleu.perl
#ibm-bleu =

### TER: translation error rate (BBN metric) based on edit distance
# not yet integrated
#
# ter =

### METEOR: gives credit to stem / worknet synonym matches
# not yet integrated
#
# meteor =

### Analysis: carry out various forms of analysis on the output
#
analysis = $moses-script-dir/ems/support/analysis.perl
#
# also report on input coverage
analyze-coverage = yes
#
# also report on phrase mappings used
report-segmentation = yes
#
# report precision of translations for each input word, broken down by
# count of input word in corpus and model
#report-precision-by-coverage = yes
#
# further precision breakdown by factor
#precision-by-coverage-factor = pos
#
# visualization of the search graph in tree-based models
#analyze-search-graph = yes

[EVALUATION:test]

### input data
#
input-sgm = $toy-data/M_Ts.$input-extension
# raw-input =
# tokenized-input =
# factorized-input =
# input =

### reference data
#
reference-sgm = $toy-data/M_Ts.$output-extension
# raw-reference =
# tokenized-reference =
# reference =

### analysis settings
# may contain any of the general evaluation analysis settings
# specific setting: base coverage statistics on earlier run
#
#precision-by-coverage-base = $working-dir/evaluation/test.analysis.5

### wrapping frame
# for nist-bleu and other scoring scripts, the output needs to be wrapped
# in sgm markup (typically like the input sgm)
#
wrapping-frame = $input-sgm

##
### REPORTING: summarize evaluation scores

[REPORTING]

### currently no parameters for reporting section

>
>

On Sat, Dec 7, 2013 at 7:21 PM, Hieu Hoang  wrote:

> are you sure the parallel data is encoded in UTF8? Was it tokenized,
> cleaned and escaped by the Moses scripts or by another external script?
>
> Can you please send me you EMS config file too
>
>
> On 7 December 2013 14:03, amir haghighi wrote:
>
>> Hi,
>>
>> I also have the same problem in the evaluation step with EMS, and I would be
>> thankful if you could help me.
>> The lexical reordering file is empty, and the output logged in
>> evaluation_test_filter.2.stderr is:
>>
>> Using SCRIPTS_ROOTDIR:
>> /opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/scripts
>> (9) create moses.ini @ Sat Dec  7 04:50:15 PST 2013
>> Executing: mkdir -p /opt/tools/workingEms/evaluation/test.filtered.2
>> Considering factor 0
>> Considering factor 0
>> filtering /opt/tools/workingEms/model/phrase-table.2 ->
>> /opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1...
>> 0 of 2197240 phrases pairs used (0.00%) - note: max length 10
>> binarizing...cat
>> /opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1 |
>> LC_ALL=C sort -T /opt/tools/workingEms/evaluation/test.filtered.2 |
>>
>> /opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/bin/processPhraseTable
>> -ttable 0 0 - -nscores 5 -out
>> /opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1
>> processing ptree for stdin
>> Segmentation fault (core dumped)
>> filtering
>> /opt/tools/workingEms/model/reordering-table.2.wbe-msd-bidirectional-fe.gz
>> ->
>>
>> /opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe...
>> 0 of 2197240 phrases pairs used (0.00%) - note: max length 10
>>
>> binarizing.../opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/b

Re: [Moses-support] error during testing

2013-12-07 Thread amir haghighi
Hi,

I also have the same problem in the evaluation step with EMS, and I would be
thankful if you could help me.
The lexical reordering file is empty, and the output logged in
evaluation_test_filter.2.stderr is:

Using SCRIPTS_ROOTDIR:
/opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/scripts
(9) create moses.ini @ Sat Dec  7 04:50:15 PST 2013
Executing: mkdir -p /opt/tools/workingEms/evaluation/test.filtered.2
Considering factor 0
Considering factor 0
filtering /opt/tools/workingEms/model/phrase-table.2 ->
/opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1...
0 of 2197240 phrases pairs used (0.00%) - note: max length 10
binarizing...cat
/opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1 |
LC_ALL=C sort -T /opt/tools/workingEms/evaluation/test.filtered.2 |
/opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/bin/processPhraseTable
-ttable 0 0 - -nscores 5 -out
/opt/tools/workingEms/evaluation/test.filtered.2/phrase-table.0-0.1.1
processing ptree for stdin
Segmentation fault (core dumped)
filtering
/opt/tools/workingEms/model/reordering-table.2.wbe-msd-bidirectional-fe.gz
->
/opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe...
0 of 2197240 phrases pairs used (0.00%) - note: max length 10
binarizing.../opt/tools/mosesdecoder-RELEASE-1.0/mosesdecoder-RELEASE-1.0/bin/processLexicalTable
-in
/opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe
-out
/opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe
processLexicalTable v0.1 by Konrad Rawlik
processing
/opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe
to
/opt/tools/workingEms/evaluation/test.filtered.2/reordering-table.2.wbe-msd-bidirectional-fe.*
ERROR: empty lexicalised reordering file
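
In the log above, both filtering steps report "0 of 2197240 phrases pairs
used", so the filtered tables are already empty before the binarizer runs.
The Python sketch below pulls those counts out of a filter log so that a
zero can be spotted early. The regular expression is written against the
wording of this log only, and the function name phrase_usage is made up for
illustration.

```python
import re

# Matches lines such as "0 of 2197240 phrases pairs used (0.00%)".
USAGE = re.compile(r"(\d+) of (\d+) phrases? pairs used")


def phrase_usage(log_text):
    """Return one (used, total) tuple per filtered table mentioned in the log."""
    return [(int(m.group(1)), int(m.group(2))) for m in USAGE.finditer(log_text)]


if __name__ == "__main__":
    log = "0 of 2197240 phrases pairs used (0.00%) - note: max length 10\n"
    for used, total in phrase_usage(log):
        if used == 0:
            print(f"no phrase pairs kept out of {total}")
```

A count of zero suggests that nothing in the test input matched the source
side of the table, so the input file given to the filter is worth checking
first.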



Barry Haddow  writes:

>
> Hi Irene
>
>  > But the output is empty. And the errors are 1. segmentation fault
> 2. error: empty lexicalized
>  > reordering file
>
> Is this lexicalised reordering file empty then?
>
> It would be helpful if you could post the full log of the output when
> your run the filter command,
>
> cheers - Barry
>
> On 26/10/12 17:59, Irene Huang wrote:
> > Hi, I have trained and tuned the model, now I am using
> >
> >  ~/mosesdecoder/scripts/training/filter-model-given-input.pl
> >  filtered-newstest2011
> > mert-work/moses.ini ~/corpus/newstest2011.true.fr
> >   \
> >   -Binarizer ~/mosesdecoder/bin/processPhraseTable
> >
> > to filter the phrase table.
> >
> > But the output is empty. And the errors are 1. segmentation fault
> > 2. error: empty lexicalized reordering file
> >
> > So does this mean it's out of memory error?
> >
> > Thanks
> >
> >
>


[Moses-support] (no subject)

2013-12-02 Thread amir haghighi
hello all,

I want to run EMS on my Ubuntu system, but I get the following error:

/graph.0.png' @ error/convert.c/ConvertImageCommand/3127.
convert: no decode delegate for this image format
`/tmp/magick-22773fbzHNDTaMj071' @ error/constitute.c/ReadImage/552.
convert: Postscript delegate failed `steps/0/graph.0.ps': No such file or
directory @ error/ps.c/ReadPSImage/837.
convert: no images defined `steps/0/graph.0.png' @
error/convert.c/ConvertImageCommand/3127.

I have already installed ImageMagick on my system.



Regards
Amir
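
The messages mention a failed PostScript delegate and a missing
steps/0/graph.0.ps, which points at tools around ImageMagick rather than
ImageMagick itself. The Python sketch below checks whether three
executables are on PATH. It assumes (not confirmed in this thread) that
the graph is drawn with Graphviz "dot" and that ImageMagick needs
Ghostscript ("gs") to read PostScript; the function name missing_tools is
made up for illustration.

```python
import shutil


def missing_tools(tools=("dot", "gs", "convert")):
    """Return the names of the given executables that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH: " + ", ".join(missing))
    else:
        print("dot, gs and convert are all available")
```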