Emad Nawfal (عماد نوفل) wrote:
Hi Tutors,
I want to color-code the different parts of a word in a
morphologically complex natural language. The file I have looks like
this, where the first column is the word and the second is the
composite part-of-speech tag. For example, Al is a DETERMINER, wlAy is
a NOUN, and At is a PLURAL NOUN SUFFIX:
Al+wlAy+At DET+NOUN+NSUFF_FEM_PL
Al+mtHd+p DET+ADJ+NSUFF_FEM_SG
The output I want is one in which the word has no plus signs and each
segment is color-coded by grammatical category. For example, the
noun is red, the det is green, and the suffix is orange, as on this
page:
http://docs.google.com/View?id=df7jv9p9_3582pt63cc4
I am stuck on the HTML part and I don't know where to start. I have
no experience with HTML, but I have this skeleton (which may not be
the right thing anyway).
Any help with materials, modules, or suggestions is appreciated.
The skeleton of my program is as follows:
#############
RED = ("NOUN", "ADJ")
GREEN = ("DET", "DEMON")
ORANGE = ("NSUFF", "VSUFF", "ADJSUFF")
Instead of that use a dictionary:
colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")
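One wrinkle with the dictionary approach: the tags in the sample lines carry feature suffixes (NSUFF_FEM_PL, NSUFF_FEM_SG), so looking up the whole tag would miss. Below is a minimal sketch, assuming the category is whatever precedes the first underscore. The helper name color_for_tag and the BLACK fallback are assumptions of this sketch, not part of the original skeleton.

```python
# Mapping from grammatical category to color name, as suggested above.
colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")

def color_for_tag(tag, default="BLACK"):
    """Return the color name for a POS tag such as 'NSUFF_FEM_PL'.

    Assumption: everything after the first underscore is a feature
    (gender, number) that does not affect the color.
    """
    category = tag.split("_")[0]   # 'NSUFF_FEM_PL' -> 'NSUFF'
    return colors.get(category, default)

print(color_for_tag("DET"))
print(color_for_tag("NSUFF_FEM_PL"))
print(color_for_tag("PUNC"))       # unknown category falls back to the default
```

Using dict.get with a default means an unexpected tag does not raise KeyError; whether a silent fallback is preferable to an error probably depends on how clean the tagged file is.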
# print html head
def print_html_head():
    # print the head of the html page
    pass

def print_html_tail():
    # print the tail of the html page
    pass

def color(segment, color):
    # STUCK HERE: should take a color and a segment, for example
    pass

# main
import sys
infile = open(sys.argv[1])  # takes as input the POS-tagged file
print_html_head()
for line in infile:
    line = line.split()
    if len(line) != 2: continue
    word = line[0]
    pos = line[1]
    zipped = zip(word.split("+"), pos.split("+"))
    for x, y in zipped:
        if y == "DET":
            color(x, "#FF0000")
        else:
            color(x, "#0000FF")
print_html_tail()
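For the part marked STUCK HERE, one possible approach is to wrap each segment in an HTML span with an inline CSS color and join the spans without the plus signs. What follows is only a sketch, assuming Python 3 and lowercase CSS color names. The names colorize, line_to_html and build_page are hypothetical, and the functions return strings rather than printing so they are easier to check.

```python
import html

# Category -> CSS color name (assumed mapping, mirroring the thread).
COLORS = {"NOUN": "red", "ADJ": "red", "DET": "green", "DEMON": "green",
          "NSUFF": "orange", "VSUFF": "orange", "ADJSUFF": "orange"}

def colorize(segment, color):
    """Wrap one segment in a <span> with an inline CSS color."""
    return '<span style="color: %s">%s</span>' % (color, html.escape(segment))

def line_to_html(line):
    """Turn 'Al+wlAy+At DET+NOUN+NSUFF_FEM_PL' into colored spans.

    Returns None for lines that do not have exactly two columns.
    """
    fields = line.split()
    if len(fields) != 2:
        return None
    word, pos = fields
    spans = []
    for segment, tag in zip(word.split("+"), pos.split("+")):
        category = tag.split("_")[0]          # drop features such as _FEM_PL
        spans.append(colorize(segment, COLORS.get(category, "black")))
    return "".join(spans)                      # no plus signs in the output

def build_page(lines):
    """Return a complete HTML page with one paragraph per valid line."""
    body = [h for h in map(line_to_html, lines) if h is not None]
    return ('<html><head><meta charset="utf-8"></head><body>\n<p>'
            + "</p>\n<p>".join(body) + "</p>\n</body></html>")

if __name__ == "__main__":
    sample = ["Al+wlAy+At DET+NOUN+NSUFF_FEM_PL",
              "Al+mtHd+p DET+ADJ+NSUFF_FEM_SG"]
    print(build_page(sample))
```

Redirecting the output to a file (for example `python script.py > out.html`, with the script name being a placeholder) and opening it in a browser should show each word with its segments in different colors. html.escape is there in case a segment ever contains characters such as < or &.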
--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه
كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"
Emad Soliman Nawfal
Indiana University, Bloomington
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
--
Bob Gailer
Chapel Hill NC
919-636-4239