OpenNLP's chunker might help here. Chunking means finding noun and verb phrases, which can help you spot recurring phrases. Because German builds long compound words, this is probably a very different problem than in English. Are there any compound-splitting (decompounding) algorithms for it?
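As for the very short answers: Thilo's carrier-sentence idea from earlier in the thread (quoted below) could be sketched roughly like this in Java. This is only a sketch under assumptions. The class `AssociationTagging`, the method `tagWithPrefix`, and the stub tagger in `main` are names made up for illustration. The `tagger` parameter is a placeholder for anything that maps tokens to tags, for example a method reference to OpenNLP's `POSTaggerME.tag(String[])`.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Sketch: tag a short survey answer by embedding it in a carrier sentence. */
class AssociationTagging {

    // Carrier words placed before each answer ("Ich liebe ", as suggested in the thread).
    static final String[] GERMAN_PREFIX = {"Ich", "liebe"};

    /**
     * Tags the answer tokens in the context of the prefix tokens and returns
     * only the tags that belong to the answer.
     * The tagger argument is a stand-in (assumption) for a real POS tagger.
     */
    static String[] tagWithPrefix(String[] prefix, String[] answer,
                                  Function<String[], String[]> tagger) {
        // Build the full token sequence: prefix followed by answer.
        String[] sentence = new String[prefix.length + answer.length];
        System.arraycopy(prefix, 0, sentence, 0, prefix.length);
        System.arraycopy(answer, 0, sentence, prefix.length, answer.length);

        String[] tags = tagger.apply(sentence);
        if (tags.length != sentence.length) {
            throw new IllegalStateException("tagger must return one tag per token");
        }
        // Drop the tags of the carrier words, keep the tags of the answer.
        return Arrays.copyOfRange(tags, prefix.length, tags.length);
    }

    public static void main(String[] args) {
        // Stub tagger for demonstration only: it labels every token "NN".
        // A real tagger would be plugged in here instead.
        Function<String[], String[]> stubTagger = tokens -> {
            String[] tags = new String[tokens.length];
            Arrays.fill(tags, "NN");
            return tags;
        };

        String[] answer = {"Sonne"};
        String[] tags = tagWithPrefix(GERMAN_PREFIX, answer, stubTagger);
        System.out.println(Arrays.toString(answer) + " -> " + Arrays.toString(tags));
    }
}
```

With a real model you would pass something like `posTagger::tag` for a `POSTaggerME` loaded from de-maxent.bin in place of the stub. Whether the prefix actually improves the tags on your data is something to check by hand on a sample of answers.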
On Sun, Jun 24, 2012 at 11:43 PM, daniel stieger <[email protected]> wrote:
> Hi,
>
> thanks a lot for your answers. My goal is to identify adjectives and nouns
> in association sentences. E.g. "What do you associate with our brand?"
> Answers: "nice mountains", "the mountains are very nice", etc.
>
> If appropriate, I would use the openNLP POS tagger (it seems to be the most
> elaborate Java POS tagger) to identify nouns and adjectives. When I input
> the sentence "the", "mountains", "are", "nice", the output is correct, also
> when using single words:
>
> >>> [DT, NNS, VBP, JJ]
> >>> [DT]
> >>> [NNS]
> >>> [VBP]
> >>> [JJ]
>
> Is the English model better than the German model? Do I have to build my
> own model, or is de-maxent appropriate?
>
> Generally, is openNLP a good choice for my task?
>
> Thanks again,
> Dan
>
> -------- Original Message --------
>> Date: Sat, 23 Jun 2012 16:53:36 -0700
>> From: Lance Norskog <[email protected]>
>> To: [email protected]
>> Subject: Re: Newby Question on German POS Tagging
>
>> What would you like to find out about your data? Until we know that, it
>> is difficult to recommend a technique.
>>
>> On Sat, Jun 23, 2012 at 4:15 AM, Thilo Goetz <[email protected]> wrote:
>> > On 22.06.2012 20:13, daniel stieger wrote:
>> >>
>> >> Hi List,
>> >>
>> >> I'm looking for some suggestions and opinions for my task. The
>> >> situation is this:
>> >>
>> >> In an online survey, approx. 800 participants were asked an open text
>> >> question like "What do you associate with our brand?". Participants
>> >> can then enter 5 associations. E.g.
>> >>
>> >> - nature
>> >> - beautiful mountains
>> >> - relax
>> >> - family friendly
>> >> - very good service
>> >>
>> >> Now I just want to run the openNLP POS tagger over all associations. I
>> >> suppose that I can treat one association as one sentence. Instead of
>> >> the English model, I used the de-maxent.bin model and some German
>> >> answers. But the tags are somehow wrong. E.g.
>> >>
>> >> sonne -> KON
>> >> familie -> ART (it is a noun, definitely not an article)
>> >>
>> >> Am I on the wrong path? Should I handle my data differently? Or should
>> >> I download another model? Where can I get training data?
>> >>
>> >> So many questions, sorry, but every hint is appreciated,
>> >>
>> >> best,
>> >> Daniel
>> >>
>> >
>> > I'm pretty sure the model was trained on complete sentences. The
>> > tagging takes context into account, and will not work properly
>> > without it. So just running it on a couple of words at a time
>> > will not work.
>> >
>> > If all your associations are NPs like your example,
>> > you can maybe fix things by always prefixing "I like the ". In
>> > German, maybe "Ich liebe ".
>> >
>> > HTH,
>> > Thilo
>> >
>>
>> --
>> Lance Norskog
>> [email protected]

--
Lance Norskog
[email protected]
