Hi,

Has anyone tried running NLTK (Python) with Spark Streaming (Scala)? I was
wondering whether this is a good idea and which Spark operators are the
right ones for it. The reason we want to try this combination is that we
don't want to run our transformations in Python (PySpark), but after the
transformations we need to run some natural language processing operations,
and we don't want to restrict the functions data scientists can use to
Spark's natural language libraries. So Spark Streaming with NLTK looks like
the right option, from the perspective of both fast data processing and
data science flexibility.
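One operator that might fit this mix is Spark's `RDD.pipe`, which streams each partition's records to an external process (e.g. a Python script that imports NLTK) over stdin and reads one output record per stdout line. Whether it suits our case is exactly the question, but the line-in/line-out contract it relies on can be sketched in plain Python; here a simple whitespace split stands in for `nltk.word_tokenize`, and the worker script name is hypothetical:

```python
import subprocess
import sys
import textwrap

# Stand-in for the worker script Spark's RDD.pipe would launch on each
# executor (e.g. a hypothetical nltk_tokenize.py): one input record per
# stdin line, one output record per stdout line. A real version would
# call nltk.word_tokenize instead of str.split.
WORKER = textwrap.dedent("""
    import sys
    for line in sys.stdin:
        tokens = line.split()  # stand-in for nltk.word_tokenize(line)
        print(" ".join(t.lower() for t in tokens))
""")

def pipe_partition(records):
    """Mimic rdd.pipe(): feed records to the worker's stdin, collect stdout lines."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER],
        input="\n".join(records),
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()

print(pipe_partition(["Spark Streaming meets NLTK", "Fast DATA processing"]))
# → ['spark streaming meets nltk', 'fast data processing']
```

The appeal of this pattern would be that the Scala job stays Scala, and the Python side is just an ordinary script, free to use any NLTK function the data scientists want.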

Regards,
Ashish