It looks like your UDF expects numeric data, but you are passing it a
StringType column. I suggest casting the column to a numeric type first.
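
A minimal sketch of such a cast, assuming the affected columns hold either
numeric text or blanks. The names blank_to_zero and cast_columns are
hypothetical; df and columns stand in for dfTotaleNormalize23 and listrapcot.
It uses built-in column expressions instead of a Python UDF, and I have not
run it against 1.6.0.

```python
def blank_to_zero(value):
    """Plain-Python statement of the rule: blank -> 0.0, numeric text -> float."""
    if value is None:
        return None
    text = str(value).strip()
    if text == "":
        return 0.0
    try:
        return float(text)
    except ValueError:
        # Non-numeric text becomes None, mirroring a failed Spark cast (null).
        return None


def cast_columns(df, columns):
    """Return df with `columns` cast to double, blanks replaced by 0.0."""
    # Imported lazily so blank_to_zero can be used without a Spark install.
    from pyspark.sql import functions as F

    targets = set(columns)

    def convert(name):
        col = F.col(name)
        # Blank strings become 0.0; everything else is cast to double.
        return (F.when(F.trim(col) == "", F.lit(0.0))
                 .otherwise(col.cast("double"))
                 .alias(name))

    return df.select([convert(c) if c in targets else F.col(c)
                      for c in df.columns])
```

Usage would be something like
dfTotaleNormalize24 = cast_columns(dfTotaleNormalize23, listrapcot).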

On Thu, 13 Apr 2017 at 7:03 pm, issues solution <issues.solut...@gmail.com>
wrote:

> Hi,
> I am new to Spark and would like to ask what is wrong with my use of
> checkpoint on PySpark 1.6.0.
>
> I don't understand what happens when I try to use it on a DataFrame:
>    dfTotaleNormalize24 = dfTotaleNormalize23.select([
>        i if i not in listrapcot else udf_Grappra(F.col(i)).alias(i)
>        for i in dfTotaleNormalize23.columns])
>
> dfTotaleNormalize24.cache()           <- cache in memory
> dfTotaleNormalize24.count()           <- materialize the DataFrame (the RDD too?)
> dfTotaleNormalize24.rdd.checkpoint()  <- cut the DAG (RDD not saved yet)
> dfTotaleNormalize24.rdd.count()       <- materialize to file
>
> But why do I get the following error:
>
>  java.lang.UnsupportedOperationException: Cannot evaluate expression:
>  PythonUDF#Grappra(input[410, StringType])
>
>
> Thanks in advance for explaining the details and the steps needed to save
> and checkpoint.
>
> My DataFrame is huge, with more than 5 million rows and 1000 columns.
>
> The UDF above is applied to more than 150 columns; it replaces ' ' with 0.0,
> that is all.
>
> regards
>
-- 
Best Regards,
Ayan Guha
