Dear all,

I would like to know whether it is possible to get ngrams that do not contain line breaks from the corpus. I'll try to explain clearly. If the input text file is

first line of text
second line
And a third line of text

then count.pl gives two bigrams that contain line breaks:

text second
line And

or these trigrams:

of text second
text second line
second line And

and so on.

Given these outputs, and after reading the help text, I don't know whether I can change the default count.pl options to get all ngrams from the corpus except those that combine words at the end of one sentence with words at the beginning of the next sentence. That is, ngrams that do not contain line breaks.

Best wishes,
Mercè
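
To make the requested behaviour concrete, here is a minimal Python sketch of ngram extraction that treats every line as its own unit, so no ngram spans a line break. It is not part of count.pl and does not reproduce its tokenization. The function name `ngrams_within_lines` and the plain whitespace splitting are assumptions made only for illustration.

```python
def ngrams_within_lines(text, n):
    """Return all n-grams (as tuples of tokens) that stay inside one line."""
    result = []
    for line in text.splitlines():
        # Assumption: tokens are whitespace-separated; count.pl may tokenize differently.
        tokens = line.split()
        # Slide a window of size n over this line only, never into the next line.
        for i in range(len(tokens) - n + 1):
            result.append(tuple(tokens[i:i + n]))
    return result


# The three-line sample file from the question.
sample = "first line of text\nsecond line\nAnd a third line of text\n"

bigrams = ngrams_within_lines(sample, 2)
trigrams = ngrams_within_lines(sample, 3)

for gram in bigrams:
    print(" ".join(gram))
```

With the sample input, the bigrams "text second" and "line And" are absent from the result, and the two-word second line contributes no trigrams at all.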