Hello

Here is the latest version of the script, which replaces noise information with epsilon transitions. Moses successfully translates very small "toy" lattices generated with this script, but on a real-world example (http://perso.crans.org/raybaud/realworld.slf.gz) Moses fails with glibc complaining about a corrupted double-linked list:

sylv...@markov /global/markov/raybauds/XP/S2T $ nice ~/softs/moses/moses-cmd/src/moses -max-phrase-length 100000 -inputtype 2 -weight-i 1 -f working-dir/filtered-OlliRehn-1/moses.ini < realworld.plf > realworld.plf.translated
Defined parameters (per moses.ini or switch):
        config: working-dir/filtered-OlliRehn-1/moses.ini
        distortion-file: 0-0 msd-bidirectional-fe 6 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/reordering-table
        distortion-limit: 6
        input-factors: 0
        inputtype: 2
        lmodel-file: 0 0 5 /home/sylvain/GMR/SYSTEMS/WMT09/lm/europarl.lm
        mapping: 0 T 0
        max-phrase-length: 100000
        ttable-file: 0 0 5 /global/markov/raybauds/XP/S2T/working-dir/filtered-OlliRehn-1/phrase-table
        ttable-limit: 20
        weight-d: 0.054012 0.196064 -0.003181 0.081071 0.086735 0.062332 0.090617
        weight-i: 1
        weight-l: 0.064254
        weight-t: 0.013876 0.081935 0.059305 0.013773 0.145934
        weight-w: 0.046909
Loading lexical distortion models...have 1 models
[...]
Finished loading phrase tables : [13.000] seconds
IO from STDOUT/STDIN
Created input-output object : [13.000] seconds
WARN: probability > 1: 1.492
*** glibc detected *** /home/sylvain/softs/moses/moses-cmd/src/moses: corrupted double-linked list: 0x000000001e7d8610 ***
I used the very latest version, a fresh trunk checkout from this morning. Of course I can't say whether this is because my lattices are ill-formed or because of Moses. So I recompiled it without optimisation (CFLAGS = -g -ggdb -O0, CXXFLAGS likewise) to get a full backtrace:

#0  0x00007ffff70f81b5 in raise () from /lib/libc.so.6
#1  0x00007ffff70f95e0 in abort () from /lib/libc.so.6
#2  0x00007ffff71333d7 in __libc_message () from /lib/libc.so.6
#3  0x00007ffff7138966 in malloc_printerr () from /lib/libc.so.6
#4  0x00007ffff713a398 in _int_free () from /lib/libc.so.6
#5  0x00007ffff713d71c in free () from /lib/libc.so.6
#6  0x000000000040d820 in __gnu_cxx::new_allocator<unsigned long>::deallocate (this=0x1d8d7738, __p=0x1da04c60) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/ext/new_allocator.h:95
#7  0x000000000040c45e in std::_Bvector_base<std::allocator<bool> >::_M_deallocate (this=0x1d8d7738) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_bvector.h:444
#8  0x000000000040b6a1 in ~_Bvector_base (this=0x1d8d7738, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_bvector.h:430
#9  0x000000000040a6d6 in ~vector (this=0x1d8d7738, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_bvector.h:547
#10 0x00000000004b87df in std::_Destroy<std::vector<bool, std::allocator<bool> > > (__pointer=0x1d8d7738) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_construct.h:83
#11 0x00000000004b82e1 in std::_Destroy_aux<false>::__destroy<std::vector<bool, std::allocator<bool> >*> (__first=0x1d8d7738, __last=0x1d8e56a8) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_construct.h:93
#12 0x00000000004b77ee in std::_Destroy<std::vector<bool, std::allocator<bool> >*> (__first=0x1d8c0470, __last=0x1d8e56a8) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_construct.h:116
#13 0x00000000004b68f9 in std::_Destroy<std::vector<bool, std::allocator<bool> >*, std::vector<bool, std::allocator<bool> > > (__first=0x1d8c0470, __last=0x1d8e56a8) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_construct.h:142
#14 0x00000000004b6030 in ~vector (this=0x7fffffffcc30, __in_chrg=<value optimized out>) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/include/g++-v4/bits/stl_vector.h:313
#15 0x00000000004b5141 in Moses::WordLattice::Read (this=0x1caf2d70, in=..., factorOrder=...) at WordLattice.cpp:110
#16 0x0000000000416d2b in IOWrapper::GetInput (this=0x1caf0760, inputType=0x1caf2d70) at IOWrapper.cpp:171
#17 0x0000000000418c1c in ReadInput (ioWrapper=..., inputType=Moses::WordLatticeInput, sour...@0x7fffffffcf48) at IOWrapper.cpp:511
#18 0x00000000004086c4 in main (argc=11, argv=0x7fffffffd0f8) at Main.cpp:398

As I understand it, it crashed while reading the lattice, so there must be something wrong in my script. It can't be the probability above one, can it? Any hint is more than welcome...

regards,
Sylvain

On Friday 06 August 2010 11:44:25 Sylvain Raybaud wrote:
> Hello
>
> A few days ago I came to you asking if someone knew of a script to turn
> word lattices from SLF (HTK format) into PLF (Moses input format). It seems
> that doesn't exist yet, so here is my proposal (Python file attached).
>
> I tried to include all the necessary information in the help message and in
> the file header.
>
> basic use:
>   slf2plf input > output
>   slf2plf < input > output
>   slf2plf -h for help
>
> It works on small "toy" lattices; I checked the results. I've run it on
> real-world lattices but I have not checked the results yet.
>
> I tried to keep as much of the flexibility of the SLF format as possible,
> but sublattices won't work yet (they will be parsed, but in the resulting
> graph they will appear as a single node; I haven't checked this though).
>
> Also, for the moment the software requires that words are specified on
> links (not nodes) and that acoustic and language model scores are present.
> The resulting score in the output graph is:
>   acoustic + language*lmscale
>
> All testing, remarks and requests welcome (of course). I can make the
> script accessible on a mercurial repository if you want.
>
> Hope this helps.
>
> regards,

--
Sylvain
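For concreteness, here is a minimal sketch of the score combination described in the quoted message (acoustic + language*lmscale, optionally exponentiated since Moses expects raw probabilities rather than log-scores). The helper name `plf_edge_score` is mine, not part of the script:

```python
from math import exp

def plf_edge_score(acoustic, language, lmscale=1.0, logbase=None, use_exp=True):
    """Combine SLF acoustic and language-model log-scores into one PLF edge score.

    Mirrors the formula from the message above: acoustic + language * lmscale.
    If use_exp is set, the log-score is exponentiated (in the lattice's declared
    log base when one is given, otherwise base e).
    """
    s = acoustic + language * lmscale
    if not use_exp:
        return s
    return logbase ** s if logbase is not None else exp(s)

# e.g. plf_edge_score(-2.0, -1.0, lmscale=2.0) == exp(-4.0)
```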
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
#
# Copyright (C) 2010, Sylvain Raybaud <srayb...@crans.org>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#     * Redistributions of source code must retain the above copyright
#       notice, this list of conditions and the following disclaimer.
#     * Redistributions in binary form must reproduce the above copyright
#       notice, this list of conditions and the following disclaimer in the
#       documentation and/or other materials provided with the distribution.
#     * Neither the name of Sylvain Raybaud nor the names of its contributors
#       may be used to endorse or promote products derived from this software
#       without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY Sylvain Raybaud <srayb...@crans.org> ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL Sylvain Raybaud BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#
# This piece of software takes a word lattice in SLF (HTK format) and converts
# it to PLF (Moses input format).
# SLF is documented here: http://labrosa.ee.columbia.edu/doc/HTKBook21/node257.html
# PLF is documented here: http://www.statmt.org/moses/?n=Moses.WordLattices
# I tried to keep most of the flexibility of SLF.
#
# Usage: slf2plf [options] input_file > output
#    or: slf2plf [options] input_file.gz > output
#    or: slf2plf [options] < input_file > output
# Options:
#   -C, --dont-clean: don't remove noise information from the input lattice
#   -S, --dont-sanitize: don't sanitize input words. Be careful (see help)
#   -h, --help: print short help
#
# Principle:
#   1) Load the graph into memory as a list of nodes and a list of links;
#      each node has access to the list of its incoming and outgoing links.
#   2) Compute a topological ordering of the nodes.
#   3) Output the lattice in Moses input format.
#
# The topological sort is the algorithm described here as of 6 August 2010:
# http://en.wikipedia.org/wiki/Topological_sorting
#
# TODO:
#   a) Make the program robust to missing acoustic and LM scores in the lattice
#   b) Make it possible to choose how the output lattice score is computed
#      from node and link attributes
#   c) Deal with sublattices
#   d) Make the program robust to incorrect input

USAGE = """ %prog [options] input_file > output
   or: %prog [options] input_file.gz > output
   or: %prog [options] < input_file > output
input_file is a graph in SLF format (HTK), possibly gzipped (.gz extension required)."""

from optparse import OptionParser
import gzip
import sys
import re
import copy
from math import exp

# general purpose tools to make your life easier
def open_plain_or_gz(filename, mode='r'):
    """Transparently open gzipped or plain files, based on the file extension."""
    return gzip.open(filename, mode) if filename.endswith(".gz") else open(filename, mode)

def verbose_msg(opts, msg):
    if opts.verbose:
        sys.stderr.write(msg + '\n')

def warn(msg):
    """Print a warning message to stderr. The program continues."""
    sys.stderr.write("*** WARNING *** %s\n" % msg)

def error(msg, code=1):
    """Print an error message to stderr and exit with the given code (default: 1)."""
    sys.stderr.write("*** ERROR *** %s\n" % msg)
    exit(code)

# regular expressions that will be used later
__attr_re__ = r"(?P<attribute>[^=]+)=(?P<quote>(\"|'')?)(?P<value>.*)(?P=quote)"  # field_name=field_value; value may be "quoted"
__size_re__ = r"(?P<attribute>N|NODES|L|LINKS)=(?P<value>\d+)"  # for parsing size information in SLF
__node_re__ = r"I=(?P<id>\d+)(?P<attributes>(\s+%s)*)" % __attr_re__  # for parsing node definitions in SLF
__link_re__ = r"J=(?P<id>\d+)(?P<attributes>(\s+%s)*)" % __attr_re__  # for parsing link definitions in SLF
__noise_re__ = r"\+\+[^\+]*\+\+"

# constants
EPSILON = "*EPS*"  # epsilon transition

# links to be cleaned out if requested
CLEAN = ['<sil>']

# word sanitizing
def __sanitize_word__(string):
    s = string.replace("'", "\\'")
    return s

# a few mandatory attributes in short and long forms; they will be stored in long form
ATTRIBUTES = {'E': 'END', 'S': 'START', 'W': 'WORD', 'a': 'acoustic', 'l': 'language'}

def __get_long_form__(string):
    if ATTRIBUTES.has_key(string):
        return ATTRIBUTES[string]
    else:
        return string

# remove comments and blank lines from the input
def __clean__(lines):
    data = []
    for l in lines:
        if l.strip() != '' and l[0] != '#':
            data.append(re.sub("#.*\n", '', l))
    return data

# print the links out of the first node in the list 'remains'
def __print_links_out__(lattice, remains, options):
    n = lattice.nodes[remains[0]]
    sys.stdout.write("(")
    for l in n.outgoing:
        link = lattice.links[l]
        distance = remains.index(link.to_node)  # distance between the current node and the node it is linked to
        a = float(link.attributes['acoustic'])
        l = float(link.attributes['language'])
        w = link.attributes['WORD']
        if options.clean and (w in CLEAN or re.match(__noise_re__, w)):
            w = EPSILON
        if options.sanitize:
            w = __sanitize_word__(w)
        score = a + l*lattice.lmscale if not options.exp else (pow(lattice.logbase, a + l*lattice.lmscale) if lattice.logbase != None else exp(a + l*lattice.lmscale))
        sys.stdout.write("('%s',%f,%i)," % (w, score, distance))
    sys.stdout.write("),")

class Node:
    """Node definition for use in a word lattice."""
    def __init__(self, index, attributes_string=None):
        """'index' is the id of the node. 'attributes_string' is a succession
        of definitions of the form name=value."""
        self.index = index
        self.incoming = set()
        self.outgoing = set()
        self.attributes = {}
        if attributes_string:
            attributes = attributes_string.split()
            for attr_str in attributes:
                m = re.match(__attr_re__, attr_str)
                attr = __get_long_form__(m.group('attribute'))
                val = m.group('value')
                if self.attributes.has_key(attr):
                    warn("Node %i: attribute '%s' redefined." % (self.index, attr))
                self.attributes[attr] = val

class Link:
    """Link definition for use in a word lattice."""
    def __init__(self, index, attributes_string=None):
        """'index' is the id of the link. 'attributes_string' is a succession
        of definitions of the form name=value."""
        self.index = index
        self.from_node = -1
        self.to_node = -1
        self.word = ""
        self.attributes = {}
        if attributes_string:
            attributes = attributes_string.split()
            for attr_str in attributes:
                m = re.match(__attr_re__, attr_str)
                attr = __get_long_form__(m.group('attribute'))
                val = m.group('value')
                if self.attributes.has_key(attr):
                    warn("Link %i: attribute '%s' redefined." % (self.index, attr))
                self.attributes[attr] = val
        try:
            self.from_node = int(self.attributes['START'])
            self.to_node = int(self.attributes['END'])
        except:
            error("Incomplete definition of link %i" % self.index)
        # TODO: check that the word attribute is defined either here or on the destination node

class Lattice:
    """Class representing a word lattice, with a constructor and a method for
    loading a lattice in HTK Standard Lattice Format. Also has a method for
    writing the lattice in Moses input format (PLF)."""
    def __init__(self, lattfile=None, lines=None):
        # can read the lattice either from a file or from data in memory
        """'lattfile' should be an open stream. 'lines' should contain a
        reversed list of already read lines. At least one of them must be
        provided."""
        data = []
        if lattfile == None and (lines == None or len(lines) == 0):
            error("Where shall I read the lattice from?")
        if lattfile != None:
            data = lattfile.readlines()  # not very memory efficient, but easier for parsing
            data.reverse()  # lines to be parsed will be stored on a stack
        if lines != None:
            data = data + lines
        data = __clean__(data)  # remove comments and empty lines
        self.nodes_count = -1
        self.links_count = -1
        self.nodes = {}
        self.links = {}
        self.sublattices = {}
        self.lmscale = 1.0
        self.logbase = None
        self.attributes = {}
        self.__load_graph__(data)

    def __load_graph__(self, lines):
        self.__parse_header__(lines)
        self.__parse_counts__(lines)
        self.__parse_nodes__(lines)
        self.__parse_links__(lines)
        if len(self.nodes) != self.nodes_count or len(self.links) != self.links_count:
            warn("Incorrect nodes or links count")
        if self.attributes.has_key('lmscale'):
            self.lmscale = float(self.attributes['lmscale'])
        if self.attributes.has_key('base'):
            self.logbase = float(self.attributes['base'])

    def __parse_header__(self, lines):
        attributes_tmp = {}
        go = True  # carry on as long as we are parsing a header
        while len(lines) > 0 and go:
            l = lines.pop()
            m = re.match(__attr_re__, l)
            if m and not re.match(__size_re__ + ".*", l):  # should match an attribute but *not* a size definition
                attr = m.group('attribute')
                val = m.group('value')
                if attributes_tmp.has_key(attr):
                    warn("Attribute '%s' redefined." % attr)
                attributes_tmp[attr] = val
                if attr == "SUBLAT":  # we are parsing a sublattice: go down, create it, then carry on
                    sublat = Lattice(lines=lines)
                    # merge attributes
                    for k in attributes_tmp:
                        if sublat.attributes.has_key(k):
                            warn("Sublattice '%s': attribute '%s' redefined." % (val, k))
                    sublat.attributes.update(attributes_tmp)
                    attributes_tmp = {}
                    self.sublattices[val] = sublat
                    if lines.pop().strip() != ":":
                        warn("Missing colon at the end of sublattice '%s'. Continuing anyway, but things could go wrong." % val)
            else:
                go = False
        if not go:
            lines.append(l)
        self.attributes = attributes_tmp

    # If I understand correctly, the counts may be written in any order, on one
    # or two lines; therefore parsed lines need to be split
    def __parse_counts__(self, lines):
        go = True
        while len(lines) > 0 and go:
            l = lines.pop()
            terms = l.split()
            while len(terms) > 0 and go:
                t = terms.pop()
                m = re.match(__size_re__, t)
                if m:
                    if m.group('attribute') == 'N' or m.group('attribute') == 'NODES':
                        if self.nodes_count != -1:
                            warn("Nodes count redefined.")
                        self.nodes_count = int(m.group('value'))
                    elif m.group('attribute') == 'L' or m.group('attribute') == 'LINKS':
                        if self.links_count != -1:
                            warn("Links count redefined.")
                        self.links_count = int(m.group('value'))
                else:
                    go = False
        if not go:
            lines.append(l)

    def __parse_nodes__(self, lines):
        go = True
        while len(lines) > 0 and go:
            l = lines.pop()
            m = re.match(__node_re__, l)
            if m:
                index = int(m.group('id'))
                n = Node(index, m.group('attributes'))
                if self.nodes.has_key(index):
                    warn("Node %i redefined." % index)
                self.nodes[index] = n
            else:
                go = False
        if not go:
            lines.append(l)

    def __parse_links__(self, lines):
        removed_links = 0
        go = True
        while len(lines) > 0 and go:
            l = lines.pop()
            m = re.match(__link_re__, l)
            if m:
                index = int(m.group('id'))
                if self.links.has_key(index):
                    warn("Link %i redefined." % index)
                e = Link(index, m.group('attributes'))
                self.__update_nodes__(e)  # information about nodes is contained in the link definition
                self.links[index] = e
            else:
                go = False
        if not go:
            lines.append(l)

    def __update_nodes__(self, link):
        # in case the nodes were not explicitly defined
        if not self.nodes.has_key(link.from_node):
            self.nodes[link.from_node] = Node(link.from_node)
        if not self.nodes.has_key(link.to_node):
            self.nodes[link.to_node] = Node(link.to_node)
        # update the connection information of the nodes
        self.nodes[link.from_node].outgoing.add(link.index)
        self.nodes[link.to_node].incoming.add(link.index)

    def remove_orphan_nodes(self):
        """Put orphaned nodes out of their misery."""
        removed_nodes = 0
        for n in self.nodes.values():
            if len(n.incoming) == 0 and len(n.outgoing) == 0:
                self.nodes.pop(n.index)
                self.nodes_count -= 1

    # NOT TESTED!!
    def prune_noise(self, cleanList):
        """[NOT TESTED] Remove noise information from the graph without
        breaking connectivity. 'cleanList' is a list of additional words that
        should cause a link to be removed."""
        for link in self.links.values():
            # get a link that should be removed
            if 'WORD' in link.attributes and ((link.attributes['WORD'] in cleanList) or (re.match(__noise_re__, link.attributes['WORD']))):
                src = self.nodes[link.from_node]
                tgt = self.nodes[link.to_node]
                # get all links out of its target node
                for l in tgt.outgoing:
                    # create a new, identical link
                    new = copy.deepcopy(self.links[l])
                    self.links_count += 1
                    new.index = self.links_count
                    # insert this link into the lattice
                    self.links[new.index] = new
                    # add this link between src and the target of l
                    new.from_node = src.index
                    src.outgoing.add(new.index)
                    self.nodes[new.to_node].incoming.add(new.index)
                # remove the link
                src.outgoing.remove(link.index)
                tgt.incoming.remove(link.index)
                self.links.pop(link.index)
                self.links_count -= 1
        self.remove_orphan_nodes()

    # write the lattice in PLF
    def write_plf(self, options):
        """Write the lattice in Moses input format (PLF) on standard output.
        options.sanitize: sanitize input words (e.g. escape quotes)
        options.clean: replace noise and silence information with epsilon transitions
        options.exp: exponentiate scores before writing"""
        # get a topological ordering
        ordered_nodes = topological_ordering(self)
        # NOT THAT EASY
        # ordered nodes of a sublattice can be inserted at the position of the node they replace
        #for i,n in len(ordered_nodes):
            #if self.nodes[n].has_key('L'): # this node is in fact a sublattice
                #sublat_ordered = self.sublattices
        # write the lattice
        sys.stdout.write("(")
        for i, n in enumerate(ordered_nodes[:-1]):  # list all nodes in topological order
            if len(self.nodes[n].outgoing) > 0:
                __print_links_out__(self, ordered_nodes[i:], options)  # and write the links leaving them
        sys.stdout.write(")\n")
        return

# the topological sorting algorithm needs to be initialised with the list of initial nodes having no incoming link
def get_starting_nodes(lattice):
    """Given an instance of class Lattice, return the list of all nodes with
    no incoming link."""
    S = []
    for n in lattice.nodes.values():
        if len(n.incoming) == 0:
            S.append(n.index)
    return S

def topological_ordering(original_lattice):
    """Given an instance of class Lattice, return a topological ordering of
    its nodes (as a list of indexes)."""
    lattice = copy.deepcopy(original_lattice)  # the lattice will be modified by the program; work on a copy
    start_nodes = get_starting_nodes(lattice)
    ordered_nodes = []
    while len(start_nodes) > 0:
        n = start_nodes.pop()
        ordered_nodes.append(n)
        for l in set(lattice.nodes[n].outgoing):
            m = lattice.nodes[lattice.links[l].to_node]
            lattice.links.pop(l)
            m.incoming.remove(l)
            lattice.nodes[n].outgoing.remove(l)
            if len(m.incoming) == 0:
                start_nodes.append(m.index)
    if len(lattice.links) > 0:
        warn("Lattice has at least one cycle.")
    return ordered_nodes

if __name__ == '__main__':
    parser = OptionParser(usage=USAGE)
    parser.add_option('-C', '--dont-clean', default=True, action="store_false", dest="clean",
                      help="Word lattice graphs generated by Sphinx contain information about noise and silences, which should be replaced with epsilon transitions for translation. Use this option if you want to keep it. Cannot be used with 'prune-noise'. [clean]")
    parser.add_option('-n', '--prune-noise', default=False, action="store_true", dest="prune_noise",
                      help="Remove transitions labeled with noise or silence in a way that preserves graph connectivity. Cannot be used with 'clean'. [%default]")
    parser.add_option('-S', '--dont-sanitize', default=True, action="store_false", dest="sanitize",
                      help="Don't sanitize input words. Be careful! For example, if there is a quote in your graph Moses won't be able to decode it. [sanitize]")
    parser.add_option('-E', '--no-exponential', default=True, action="store_false", dest="exp",
                      help="Sphinx lattices normally contain log probabilities while Moses expects raw probabilities. Use this option if you want to keep the scores the way they are. [use exponential]")
    parser.add_option('-v', '--verbose', default=False, action="store_true", dest="verbose",
                      help="verbosity [%default]")
    opts, args = parser.parse_args()
    if len(args) > 1:
        parser.print_help()
        error("Too many arguments.", 1)
    if opts.clean and opts.prune_noise:
        warn("'prune_noise' requested. Disabling 'clean'.")
        opts.clean = False
    input_file = sys.stdin if (len(args) == 0 or args[0] == "-") else open_plain_or_gz(args[0])
    lattice = Lattice(input_file)
    if opts.prune_noise:
        lattice.prune_noise(CLEAN)
    lattice.write_plf(opts)
    if not (len(args) == 0 or args[0] == "-"):
        input_file.close()
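For reference, the topological sort in the script follows the Wikipedia description it cites (Kahn's algorithm). A minimal standalone sketch of the same scheme on a plain edge list (the function name `kahn_order` is mine, not part of the script):

```python
from collections import defaultdict

def kahn_order(edges):
    """Topological order of nodes, given a list of (src, dst) edges.

    Same scheme as topological_ordering() in the script: repeatedly emit a
    node whose incoming edges have all been consumed, deleting its outgoing
    edges as it is emitted. A leftover edge means the graph has a cycle.
    """
    outgoing = defaultdict(list)  # parallel edges are allowed, so use a list
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        outgoing[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))
    ready = [n for n in nodes if indegree[n] == 0]  # nodes with no incoming link
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in outgoing[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("graph has at least one cycle")
    return order
```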
_______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support