Re: [Moses-support] kbmira segfault

2015-03-05 Thread Barry Haddow
Hi Matt

That seems right. When reading an nbest list, sparse and dense features 
are stored differently so you just need to know how many there are, 
whereas in hypergraphs all features look like sparse features. This 
needs a cleanup ...

cheers - Barry
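
To illustrate the distinction above, here is a minimal Python sketch that splits one n-best feature line, in the format quoted further down this thread, into positional dense values and named sparse values. It then flattens both into a single name-to-value map, which is roughly how a hypergraph presents features. The function names and the idea of supplying the dense names separately are assumptions for illustration; this is not kbmira's actual parser.

```python
def split_features(line):
    """Split an n-best feature line into (dense list, sparse dict)."""
    dense, sparse = [], {}
    for token in line.split():
        if "=" in token:
            # name=value tokens are treated as sparse features
            name, value = token.split("=", 1)
            sparse[name] = float(value)
        else:
            # bare numbers are dense features, identified only by position
            dense.append(float(token))
    return dense, sparse


def as_named(dense, sparse, dense_names):
    """Flatten dense and sparse features into one name -> value map."""
    if len(dense) != len(dense_names):
        raise ValueError(
            "expected %d dense values, got %d" % (len(dense_names), len(dense))
        )
    named = dict(zip(dense_names, dense))
    named.update(sparse)
    return named


# Sample line and names copied from the features file quoted in this thread.
line = "-82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100"
names = "lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion".split()
dense, sparse = split_features(line)
print(as_named(dense, sparse, names))
```

In the flattened map, nothing marks which entries were dense, which is why a tool reading only that view needs the list of dense names from somewhere else.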

On 05/03/15 19:33, Matt Post wrote:
> Yes, passing --dense-init worked. Although, it seems to ignore the feature 
> names: so long as I have enough lines matching the number of dense 
> parameters, it works, and it always outputs the following:
>
>  477/3000 updates, avg loss = 0.36341, BLEU = 0.356527
>  F0 3.663
>  F1 0.221152
>  F2 0.186323
>  F3 1.41851
>  F4 2.38853
>  F5 -0.162657
>  F6 0.430753
>  F7 3.93281
>
> Does that sound correct?
>
>
>> On Mar 5, 2015, at 10:34 AM, Barry Haddow  wrote:
>>
>> Hi Matt
>>
>> This was part of the changes to support hypergraph mira, since the 
>> hypergraphs don't have the FEATURES_TXT_BEGIN_0 sections. In fact they don't 
>> differentiate between sparse and dense features.
>>
>> Does it work correctly when you use the --dense-init parameter?
>>
>> cheers - Barry
>>
>> On 05/03/15 15:18, Matt Post wrote:
>>> Okay, the old kbmira works, so this must be part of the 3.0 changes.
>>>
>>> It seems that the names of features in the header line 
>>> (FEATURES_TXT_BEGIN_0) are ignored entirely. The 2.1 kbmira would output 
>>> dense feature weights using names F1..FN, which I would then re-map back to 
>>> the list in the header. In kbmira 3.0, it uses the file passed in, as Barry 
>>> pointed out.
>>>
>>> Thanks for your help!
>>>
>>> matt
>>>
>>>
>>>> On Feb 27, 2015, at 1:21 PM, Matt Post wrote:
>>>>
>>>> Although, those old successful runs might have been with the old Moses
>>>> kbmira. I'll look into this and report back.
>>>>
>>>> matt
>>>>
>>>>
>>>>> On Feb 27, 2015, at 12:19 PM, Matt Post wrote:
>>>>>
>>>>> Hi Barry, thanks for the response. I don't think that's it, because I
>>>>> use the exact same approach for lots of other tuning runs. Isn't it the
>>>>> header line of the features file that lists dense features? I've been
>>>>> using this format, where dense features are listed in each header line,
>>>>> and then sparse features in the individual lines:
>>>>>
>>>>> FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion
>>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8
>>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100
>>>>>
>>>>> This works in lots of places (although, it also raises a separate
>>>>> question, of whether kbmira actually distinguishes between sparse and
>>>>> dense features? I seem to remember Colin once saying that there is a
>>>>> single group weight between the two groups, but I've never been able to
>>>>> find this in the code).
>>>>>
>>>>> matt
>>>>>
>>>>>
>>>>>> On Feb 26, 2015, at 5:35 PM, Barry Haddow wrote:
>>>>>>
>>>>>> Hi Matt
>>>>>>
>>>>>> When mert-moses.pl runs kbmira, it always supplies a list of the dense
>>>>>> features (and their initial values) using the --dense-init parameter. I
>>>>>> think this is your problem. I've attached a typical file used for this
>>>>>> feature list.
>>>>>>
>>>>>> Of course, kbmira should have a sensible message rather than a segfault.
>>>>>> This is probably my doing,
>>>>>>
>>>>>> cheers - Barry
>>>>>>
>>>>>> On 26/02/15 22:18, Matt Post wrote:
>>>>>>> kbmira segfaults on the following command:
>>>>>>>
>>>>>>> kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>>>>>>>
>>>>>>> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be
>>>>>>> downloaded here:
>>>>>>>
>>>>>>> https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
>>>>>>> https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>>>>>>>
>>>>>>> I tracked it down to this line of mert/FeatureStats.cpp.
>>>>>>>
>>>>>>> std::string SparseVector::decode(std::size_t id)
>>>>>>> {
>>>>>>>   return m_id_to_name[id];
>>>>>>> }
>>>>>>>
>>>>>>> Any obvious ideas before I go down this rabbit hole? I verified there
>>>>>>> are no blank lines or anything else funny with the formatting, at least
>>>>>>> as far as I can tell (all dense features, plus one sparse feature,
>>>>>>> OOVPenalty=-100, showing up occasionally).
>>>>>>>
>>>>>>> matt
>>>>>>>
>>>>>>> ___
>>>>>>> Moses-support mailing list
>>>>>>> Moses-support@mit.edu
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support

[Moses-support] mert-moses.pl

2015-03-05 Thread mohamed hasanien
Hi all, I am trying to train an English-Arabic system, and everything was fine
until I tried to tune the system using this command:
 nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl
~/thesiscorups/tuning.true.en ~/thesiscorups/tuning.true.ar
~/mosesdecoder/bin/moses train/model/moses.ini --mertdir ~/mosesdecoder/bin/
In the mert-work folder I found only one iteration and run1.moses.ini.
I also opened the output file and found the following lines at the end of the file:
---
BEST TRANSLATION: مجلس التجارة والتنمية ، [11]  [total=-1.972] core=(0.000,-4.000,1.000,-2.303,-17.198,-1.386,-10.604,-0.511,0.000,0.000,0.000,0.000,0.000,0.000,-1$
Line 4738: Decision rule took 0.000 seconds total
Line 4738: Additional reporting took 0.020 seconds total
Line 4738: Translation took 0.308 seconds total
Translating: takes note with appreciation of the technical cooperation activities carried out by the UNCTAD secretariat and of the reports prepared for the Working Par$
Line 4739: Initialize search took 1.250 seconds total
Line 4739: Collecting options took 0.562 seconds at moses/Manager.cpp:117
sh: line 1: 13550 Killed                  /mhmd/mosesdecoder/bin/moses -config filtered/moses.ini -weight-overwrite 'PhrasePenalty0= 0.043478 WordPenalty0= -0.217391 T$
Exit code: 137
The decoder died. CONFIG WAS -weight-overwrite 'PhrasePenalty0= 0.043478 WordPenalty0= -0.217391 TranslationModel0= 0.043478 0.043478 0.043478 0.043478 Distortion0= 0.$
Can anyone tell me what the problem is and how I can solve it?
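
The `Killed` and `Exit code: 137` entries in the log above mean the decoder was terminated by a signal rather than exiting on its own. By shell convention, an exit status above 128 encodes 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL). On Linux this is often the kernel's out-of-memory killer, although the log itself does not confirm that. A minimal Python sketch of the decoding, where the function name `describe_exit_code` is made up for illustration:

```python
import signal


def describe_exit_code(code):
    """Explain a shell exit status, decoding the 128 + signal convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = "signal %d" % signum
        return "terminated by %s" % name
    return "exited with status %d" % code


print(describe_exit_code(137))
```

If memory is the suspect, checking the system log for out-of-memory messages and watching the decoder's memory use during the first tuning iteration would be the natural next steps.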


Re: [Moses-support] kbmira segfault

2015-03-05 Thread Matt Post
Yes, passing --dense-init worked. Although, it seems to ignore the feature 
names: so long as I have enough lines matching the number of dense parameters, 
it works, and it always outputs the following:

477/3000 updates, avg loss = 0.36341, BLEU = 0.356527
F0 3.663
F1 0.221152
F2 0.186323
F3 1.41851
F4 2.38853
F5 -0.162657
F6 0.430753
F7 3.93281

Does that sound correct?
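
The observation above (the run succeeds as long as the file has enough lines) suggests a simple sanity check before launching kbmira. The Python sketch below only counts non-blank lines and deliberately assumes nothing about what each line of the --dense-init file contains, since that format is not shown in this thread. The function name and the sample contents are hypothetical.

```python
def has_enough_dense_lines(dense_init_text, n_dense):
    """Return True if the text has at least n_dense non-blank lines."""
    lines = [ln for ln in dense_init_text.splitlines() if ln.strip()]
    return len(lines) >= n_dense


# Hypothetical contents: placeholders, not the real --dense-init syntax.
sample = "line-for-feature-1\nline-for-feature-2\n\nline-for-feature-3\n"
print(has_enough_dense_lines(sample, 3))
print(has_enough_dense_lines(sample, 9))
```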




Re: [Moses-support] kbmira segfault

2015-03-05 Thread Barry Haddow
Hi Matt

This was part of the changes to support hypergraph mira, since the 
hypergraphs don't have the FEATURES_TXT_BEGIN_0 sections. In fact they 
don't differentiate between sparse and dense features.

Does it work correctly when you use the --dense-init parameter?

cheers - Barry



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Moses-support] kbmira segfault

2015-03-05 Thread Matt Post
Okay, the old kbmira works, so this must be part of the 3.0 changes.

It seems that the names of features in the header line (FEATURES_TXT_BEGIN_0) 
are ignored entirely. The 2.1 kbmira would output dense feature weights using 
names F1..FN, which I would then re-map back to the list in the header. In 
kbmira 3.0, it uses the file passed in, as Barry pointed out.

Thanks for your help!

matt
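
The re-mapping step described above can be sketched in Python as follows. It assumes that the fourth field of the FEATURES_TXT_BEGIN_0 header is the number of dense features and that the names follow it (this matches the header quoted in this thread, but is not checked against the Moses source), and that the index in an `F<i>` output line is an offset into that list. Whether numbering starts at 0 or 1 is left to the hypothetical `base` parameter; the function names are likewise made up.

```python
def header_names(header_line):
    """Extract dense feature names from a FEATURES_TXT_BEGIN_0 header line."""
    fields = header_line.split()
    count = int(fields[3])  # assumed: fourth field is the dense feature count
    return fields[4:4 + count]


def remap_weights(output_lines, names, base=0):
    """Map 'F<i> <weight>' lines onto feature names; other lines are skipped."""
    weights = {}
    for line in output_lines:
        parts = line.split()
        if len(parts) != 2 or not parts[0].startswith("F"):
            continue  # e.g. the progress line with loss and BLEU
        if not parts[0][1:].isdigit():
            continue
        index = int(parts[0][1:]) - base
        weights[names[index]] = float(parts[1])
    return weights


header = ("FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 "
          "tm_pt_2 WordPenalty PhrasePenalty Distortion")
output = [
    "477/3000 updates, avg loss = 0.36341, BLEU = 0.356527",
    "F0 3.663",
    "F1 0.221152",
]
print(remap_weights(output, header_names(header)))
```

Passing `base=1` would cover output numbered F1..FN, as described for the 2.1 kbmira.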




[Moses-support] Compiling Moses Decoder v3.0 using Visual Studio 2010

2015-03-05 Thread Muhammad Danial Raza
Hello all,

We are implementing machine translation for one of our applications and,
after some research, chose Moses. Our application is written in C#
.NET. I am having problems compiling the Moses decoder using *VS 2010*.

The steps I followed to compile:

   - Downloaded *Moses Release 3.0* from GitHub.
   - Downloaded *Boost 1.55* and compiled it.
   - Opened 'moses.sln', located in 'mosesdecoder/contrib/other-builds'.
   - Building the solution produced many "No such file or directory"
   errors. I removed them by changing the source file paths in the
   '.vcxproj' files and then changing the 'library' and 'include' paths
   in the project properties.
   - I have now successfully built the 'moses', 'kenlm' and 'OnDiskPt'
   projects of the solution and generated their '.lib' files.
   - But when I built 'moses-cmd' I received "unresolved external symbol"
   errors. After some research I found that this is because 'moses.sln'
   contains files from a very old version of Moses, so some files
   referenced by the current release are either missing from the 'moses'
   project or have a different name in the current release.
   - For about two days I have been trying to resolve these issues by
   adding the referenced files, but each time I add a file some errors
   disappear and new errors appear for other referenced files. Moreover,
   I cannot remove some of these errors because the referenced files are
   already part of the solution.
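
The mismatch described in the steps above, between the files listed in the project and the files in the current release, can be checked mechanically instead of one linker error at a time. The Python sketch below reads the `ClCompile` entries from a .vcxproj file (an MSBuild XML document) and compares them, by base file name only, with a list of source files found on disk. The function names and the sample data are hypothetical, and comparing by base name is a simplification that would confuse two files with the same name in different directories.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# XML namespace used by MSBuild project files such as .vcxproj
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def base_name(path):
    """Lower-cased file name, accepting both / and \\ separators."""
    return PurePosixPath(path.replace("\\", "/")).name.lower()


def project_sources(vcxproj_xml):
    """Return the base names of all ClCompile items that have an Include."""
    root = ET.fromstring(vcxproj_xml)
    names = set()
    for item in root.iter(MSBUILD_NS + "ClCompile"):
        include = item.get("Include")
        if include:  # ClCompile settings blocks carry no Include attribute
            names.add(base_name(include))
    return names


def compare_project_to_disk(vcxproj_xml, files_on_disk):
    """Return (on disk but not in project, in project but not on disk)."""
    in_project = project_sources(vcxproj_xml)
    on_disk = {base_name(p) for p in files_on_disk}
    return sorted(on_disk - in_project), sorted(in_project - on_disk)


# Hypothetical project fragment and file list, for illustration only.
sample = (
    '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">'
    '<ItemGroup>'
    '<ClCompile Include="..\\..\\moses\\Manager.cpp" />'
    '<ClCompile Include="..\\..\\moses\\OldFile.cpp" />'
    '</ItemGroup></Project>'
)
missing, stale = compare_project_to_disk(
    sample, ["moses/Manager.cpp", "moses/NewFile.cpp"]
)
print("not in project:", missing)
print("not on disk:", stale)
```

Files reported as not in the project are candidates to add, and files reported as not on disk are candidates to remove or rename in the project.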

We cannot move to Linux for decoding because we don't have a Linux server
on which to deploy the decoder. Any help with compiling the decoder using
VS (whether by fixing the errors or by building a solution file from
scratch) would be highly appreciated.

If anyone has questions or needs some files to look into the
current situation, feel free to ask.

Thanks & Regards,
Muhammad Danial Raza