Please take a look at JavaUtils#mapAsSerializableJavaMap FYI
On Mon, May 16, 2016 at 3:24 AM, 喜之郎 <251922...@qq.com> wrote:

> hi, Ted.
> I found a built-in function called str_to_map, which can transform a string
> to a map, but it does not meet my need: my string may be a map with an
> array nested in its values, for example map<string, array<string>>. I don't
> think str_to_map can handle that case.
>
> Cheers
>
> ------------------ Original Message ------------------
> *From:* "喜之郎" <251922...@qq.com>
> *Sent:* Monday, May 16, 2016, 10:00 AM
> *To:* "Ted Yu" <yuzhih...@gmail.com>
> *Cc:* "user" <user@spark.apache.org>
> *Subject:* Re: spark udf can not change a json string to a map
>
> This is my use case:
> Another system uploads csv files to my system. The csv files contain
> complicated data types such as map. In order to express complicated data
> types, and ordinary strings containing special characters, we store
> urlencoded strings in the csv files. So we use urlencoded json strings to
> represent maps, strings and arrays.
>
> Second stage: load the csv files into a spark text table.
> ###############
> CREATE TABLE `a_text`(
>   parameters string
> );
> LOAD DATA INPATH 'XXX' INTO TABLE a_text;
> #############
> Third stage: insert into a spark parquet table, selecting from the text
> table. To take advantage of the complicated data types, we use a udf to
> transform each json string into a map and store the map in the table.
>
> CREATE TABLE `a_parquet`(
>   parameters map<string,string>
> );
>
> INSERT INTO a_parquet SELECT UDF(parameters) FROM a_text;
>
> So do you have any suggestions?
>
> ------------------ Original Message ------------------
> *From:* "Ted Yu" <yuzhih...@gmail.com>
> *Sent:* Monday, May 16, 2016, 12:44 AM
> *To:* "喜之郎" <251922...@qq.com>
> *Cc:* "user" <user@spark.apache.org>
> *Subject:* Re: spark udf can not change a json string to a map
>
> Can you let us know more about your use case?
>
> I wonder if you can structure your udf so it does not return a Map.
>
> Cheers
>
> On Sun, May 15, 2016 at 9:18 AM, 喜之郎 <251922...@qq.com> wrote:
>
>> Hi, all.
>> I want to implement a udf that converts a json string to a
>> map<string,string>, but I am running into a problem. My spark version is
>> 1.5.1.
>>
>> my udf code:
>> ####################
>> public Map<String, String> evaluate(final String s) {
>>     if (s == null)
>>         return null;
>>     return getString(s);
>> }
>>
>> @SuppressWarnings("unchecked")
>> public static Map<String, String> getString(String s) {
>>     try {
>>         String str = URLDecoder.decode(s, "UTF-8");
>>         ObjectMapper mapper = new ObjectMapper();
>>         Map<String, String> map = mapper.readValue(str, Map.class);
>>         return map;
>>     } catch (Exception e) {
>>         return new HashMap<String, String>();
>>     }
>> }
>> #############
>> exception info:
>>
>> 16/05/14 21:05:22 ERROR CliDriver:
>> org.apache.spark.sql.AnalysisException: Map type in java is unsupported
>> because JVM type erasure makes spark fail to catch key and value types in
>> Map<>; line 1 pos 352
>>     at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:230)
>>     at org.apache.spark.sql.hive.HiveSimpleUDF.javaClassToDataType(hiveUDFs.scala:107)
>>     at org.apache.spark.sql.hive.HiveSimpleUDF.<init>(hiveUDFs.scala:136)
>> ################
>>
>> I have seen a test suite in spark which says spark does not support this
>> kind of udf. But is there another way to implement it?
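[Editor's note] The AnalysisException in the thread comes from type erasure: at runtime the JVM erases `Map<String, String>` in the udf's `evaluate` signature to the raw interface `java.util.Map`, and Spark 1.5's `HiveSimpleUDF` works only from that erased `Class`, so it cannot recover the key and value types. A minimal, self-contained sketch (the `ErasureDemo` class is hypothetical, not Spark code) of what reflection sees:

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.util.HashMap;
import java.util.Map;

public class ErasureDemo {
    // Same shape as the evaluate method in the thread's udf.
    public Map<String, String> evaluate(String s) {
        return new HashMap<String, String>();
    }

    public static void main(String[] args) throws Exception {
        Method m = ErasureDemo.class.getMethod("evaluate", String.class);

        // The erased runtime type is just the raw interface: this is all
        // that javaClassToDataType gets to see.
        System.out.println(m.getReturnType().getName()); // java.util.Map

        // The generic signature does survive in class-file metadata, but
        // Spark 1.5's Hive UDF wrapper does not consult it.
        ParameterizedType t = (ParameterizedType) m.getGenericReturnType();
        System.out.println(t.getActualTypeArguments().length); // 2
    }
}
```

This is why workarounds that state the return type explicitly (for example a Hive `GenericUDF` with ObjectInspectors, or registering the function with an explicit map DataType instead of relying on reflection) sidestep the problem.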
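[Editor's note] The pipeline described in the thread hinges on url-encoding the json before it goes into the csv cell, so that commas and quotes inside the json cannot collide with the csv delimiters. A small stdlib-only sketch of that round trip (the json value and its field names here are made-up examples; the real udf hands the decoded string to Jackson):

```java
import java.net.URLDecoder;
import java.net.URLEncoder;

public class CsvCellDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical cell value: a map whose values are arrays,
        // i.e. the map<string, array<string>> case from the thread.
        String json = "{\"tags\":[\"a,b\",\"c\"],\"name\":\"x\"}";

        // What the upstream system would write into the csv cell:
        String cell = URLEncoder.encode(json, "UTF-8");
        // No raw commas or quotes survive, so the csv stays parseable.
        System.out.println(cell.contains(",")); // false

        // What the udf does first, before handing off to a json parser:
        String decoded = URLDecoder.decode(cell, "UTF-8");
        System.out.println(decoded.equals(json)); // true
    }
}
```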