Hi, I'm not a programmer. I start working as text miner and as a first task I have given 1000 dirty files that needs to be cleaned before classification tasks. I have been told python is the best tool for this job.
Each file's structure as below: Comments: This is article 1965 obtained from the website Title: Banana Report #65, September 2003 Author: dylab Date: 1st September 2003 Section: pulse In the past month: A mass hit North America, cutting electricity to 50 million people across the North east I'm expected execute the python script so the file suppose to look like this: pulse, In, the, past, month, A, mass, hit, North, America, cutting, electricity, to, 50, million, people, across, the, North east, dylab Could you please point me to right direction here. Or provide some example code. In the mean time I'll be searching myself. I know you guys hate novice people like me but I would appreciated if you could provide little help here. Thanks & regards, Sez -- http://mail.python.org/mailman/listinfo/python-list
