Hi Per,

If I understand correctly, this might give what you want:

lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+<n>" |
  sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | uniq

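lt-expand lists all the forms (one surface:lemma<tags> entry per line), grep
keeps the ones where the first tag is <n>, sed strips everything but the
lemma, and uniq removes duplicates. Note that uniq only drops adjacent
duplicates, so use sort -u instead if a lemma shows up more than once.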
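
For the other parts of speech, the same pipeline could be wrapped in a small
shell function that takes the tag as an argument. This is only a sketch: the
name lemmas_for_tag is made up for illustration, and I am assuming verbs are
tagged <vblex> in apertium-swe, as in most Apertium dictionaries (check the
<sdefs> section of the .dix file to be sure).

```shell
# lemmas_for_tag TAG (hypothetical helper, not part of Apertium):
# reads lt-expand output on stdin and prints each lemma whose first
# tag is <TAG>, once, in sorted order.
lemmas_for_tag() {
  # keep lines of the form surface:lemma<TAG>...
  grep -E "^[^<:>]+:[^<:>]+<$1>" |
    # drop the surface form and everything from the first tag onwards
    sed -E 's/^[^<:>]+:([^<:>]+).*/\1/' |
    # sort -u removes duplicates even when they are not adjacent
    sort -u
}

# Usage (assumes lt-expand is installed and the .dix file is in the
# current directory):
#   lt-expand apertium-swe.swe.dix | lemmas_for_tag n     > nouns.txt
#   lt-expand apertium-swe.swe.dix | lemmas_for_tag vblex > verbs.txt
```

Like the one-liner above, this skips entries that lt-expand marks with a
direction restriction (the ones printed with :>: or :<: between the form and
the lemma).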

Daniel

On Wed, Apr 22, 2020 at 7:54 AM Per Tunedal <per.tune...@operamail.com>
wrote:

> Hi,
> I need an ordinary dictionary of Swedish lemmas (just the lemmas, nothing
> else). How do I accomplish this?
>
> I read the Wiki:
> http://wiki.apertium.org/wiki/Dixtools:_Grep
>
> Thus I tried:
> apertium-dixtools grep --par '.*__n' apertium-swe.swe.dix
>
> but nothing was filtered. I got the whole file.
>
> I have a bit of trouble using grep, as I find regular expressions a bit hard
> to grasp. Unfortunately, I often get them wrong and get unexpected results.
>
> Now, I would like a list of nouns (just the lemmas) for a start. Then I
> need lists of the other parts of speech, verbs for instance.
>
> The expression below from http://wiki.apertium.org/wiki/Dictionary_reader:
> apertium-dixtools dic-reader list-lemmas apertium-swe.swe.dix
> gives me ALL lemmas. But I would like to choose the part of speech.
>
> I'm running Ubuntu as an app on Windows 10.
>
> Please give me a hand!
>
> Yours,
> Per Tunedal
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>