Meant to include: I have this function which seems to work, but I am not sure if it is always correct:
def octet_length(s):
    return len(s.encode('utf8'))

sqlContext.registerFunction('octet_length', lambda x: octet_length(x))

> On Mar 8, 2016, at 12:30 PM, Cramblit, Ross (Reuters News)
> <ross.cramb...@thomsonreuters.com> wrote:
>
> I am trying to define a UDF to calculate octet_length of a string but I am
> having some trouble getting it right. Does anyone have a working version of
> this already/any pointers?
>
> I am using Spark 1.5.2/Python 2.7.
>
> Thanks
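
On the "always correct" question, two cases are worth checking: NULL column values, which reach a Python UDF as None, and values that are already byte strings. The following is only a sketch of a more defensive variant, not a tested Spark job. The None and bytes handling is an assumption about the desired behavior, and the registration shown in the trailing comment is left unexecuted so the function can be tried without a cluster.

```python
def octet_length(s):
    """Return the number of bytes in the UTF-8 encoding of s.

    Assumed behavior: None (SQL NULL) maps to None, and a value that is
    already a byte string is measured as-is instead of being re-encoded.
    """
    if s is None:
        return None
    if isinstance(s, bytes):
        # In Python 2.7, `bytes` is an alias for `str`, so this branch also
        # avoids the implicit ASCII decode that s.encode() would trigger.
        return len(s)
    return len(s.encode('utf-8'))


# Registration sketch (not executed here). registerFunction defaults to a
# string return type, so an integer type is passed explicitly:
#
#   from pyspark.sql.types import IntegerType
#   sqlContext.registerFunction('octet_length', octet_length, IntegerType())
```

With this variant, ASCII text yields one byte per character, while characters outside ASCII count as two or more bytes each.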