Hello guys,
It's me again. I'm trying to classify Portuguese text by following the
text classification demo tutorial
(http://predictionio.incubator.apache.org/demo/textclassification/).
Has anyone already done this with PredictionIO? What would be the best way
to handle stemming and Portuguese stop words?
Let me take this opportunity to ask another question. Has anyone had
problems with encoding? My CSV input file is in ISO-8859-1, and in my Python
script I convert the text to UTF-8 before sending it:
# Python 2: decode the ISO-8859-1 bytes, then re-encode them as UTF-8
text_utf8 = text.decode('iso-8859-1').encode('utf-8')
client.create_event(
    event="documents",
    entity_type="source",
    entity_id=str(count),  # use the running count as the entity ID
    properties={
        "text": text_utf8,
        "category": attr[2],
        "label": int(attr[3])
    }
)
When I retrieve the events from http://localhost:7070/events.json, the text
comes back with an escaped character. Is that expected?
{"eventId":"x","event":"documents","entityType":"source","entityId":"73","properties":{"category":"A","text":"Gest\u008bo de Caixa","label":2},"eventTime":"2017-01-19T12:31:27.863Z","creationTime":"2017-01-19T12:31:27.867Z"}
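The escaped character in that response can be reproduced locally with a small sketch. The raw bytes below are hypothetical, and the idea that the CSV might really be Mac OS Roman rather than ISO-8859-1 is only an assumption to test, not something confirmed. It relies on the fact that byte 0x8B is a control character in ISO-8859-1 but the letter "ã" in Mac OS Roman:

```python
# -*- coding: utf-8 -*-
import json

# Hypothetical raw bytes, assuming the CSV stores the accented letter as 0x8B.
raw = b"Gest\x8bo de Caixa"

# Decoded as ISO-8859-1, 0x8B becomes the control character U+008B.
as_latin1 = raw.decode("iso-8859-1")

# Decoded as Mac OS Roman, 0x8B becomes U+00E3 (a with tilde).
as_macroman = raw.decode("mac_roman")

# json.dumps escapes non-ASCII characters by default, which makes the
# difference visible regardless of the terminal encoding.
print(json.dumps(as_latin1))    # "Gest\u008bo de Caixa"
print(json.dumps(as_macroman))  # "Gest\u00e3o de Caixa"
```

If the first line matches what the event server returns, the source file is probably not ISO-8859-1, and it may be worth checking its real encoding before converting it to UTF-8.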
I really appreciate your attention.
--
Marcus Vinicius A. Silva