Which Hadoop version are you using? It seems the exception you got was caused by an incompatible Hadoop version.
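(Editor's note, a sketch not from the thread: "Server IPC version 9 cannot communicate with client version 4" typically means a client built against Hadoop 1.x/0.20.x is talking to a Hadoop 2.x NameNode. Assuming a Maven-based Tachyon checkout that exposes a `hadoop.version` build property, one possible way to check and rebuild is:)

```shell
# Check which version the cluster is actually running.
hadoop version

# Rebuild the Tachyon client against that version (property name assumed
# from the Tachyon build; substitute your cluster's exact version).
mvn clean package -DskipTests -Dhadoop.version=2.4.0

# Then retry the step that failed.
./bin/tachyon format
```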
Best,
Haoyuan

On Wed, Dec 10, 2014 at 12:30 AM, 十六夜涙 <cr...@qq.com> wrote:
> Hi All,
> I've read the official docs of Tachyon, and it does not seem to fit my usage. As I
> understand it, it just caches files in memory. I have a file of over a million
> lines, about 70 MB; retrieving the data and mapping it to a *Map* variable costs
> several minutes, and I don't want to repeat that work each time in the map
> function. Tachyon also raises another problem: an exception while doing
> *./bin/tachyon format*.
> The exception:
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot
> communicate with client version 4
> It seems there is a compatibility problem with Hadoop, but even if that is
> solved, the efficiency issue described above remains.
> Could somebody tell me how to persist the data in memory? For now I just
> broadcast it, and re-submit the Spark application when the broadcast value
> becomes unavailable.
>
> ------------------ Original message ------------------
> *From:* "Akhil Das" <ak...@sigmoidanalytics.com>
> *Sent:* Tuesday, December 9, 2014, 3:42 PM
> *To:* "十六夜涙" <cr...@qq.com>
> *Cc:* "user" <u...@spark.incubator.apache.org>
> *Subject:* Re: spark broadcast unavailable
>
> You cannot pass the sc object (*val b = Utils.load(sc, ip_lib_path)*)
> inside a map function, and that is why the serialization exception is popping
> up (since sc is not serializable). You can try Tachyon's cache if you want
> to persist the data in memory more or less forever.
>
> Thanks
> Best Regards
>
> On Tue, Dec 9, 2014 at 12:12 PM, 十六夜涙 <cr...@qq.com> wrote:
>> Hi all,
>> In my Spark application I load a CSV file and map the data into a Map
>> variable for later use on the driver node, then broadcast it. Everything
>> works fine until a java.io.FileNotFoundException occurs; the console log
>> shows the broadcast is unavailable. I googled this problem, and it says
>> Spark will clean up the broadcast. There is a solution in which the author
>> mentions re-broadcasting. I followed that approach and wrote some
>> exception-handling code with `try`/`catch`. After compiling and submitting
>> the jar, I faced another problem: it shows "task not serializable".
>> So here I have three options:
>> 1. find the right way to persist the broadcast;
>> 2. solve the "task not serializable" problem when re-broadcasting the variable;
>> 3. save the data to some kind of database, although I would prefer to keep
>> the data in memory.
>>
>> Here are some code snippets:
>> val esRdd = kafkaDStreams.flatMap(_.split("\\n"))
>>   .map {
>>     case esregex(datetime, time_request) =>
>>       var ipInfo: Array[String] = Array.empty
>>       try {
>>         ipInfo = Utils.getIpInfo(client_ip, b.value)
>>       } catch {
>>         case e: java.io.FileNotFoundException =>
>>           val b = Utils.load(sc, ip_lib_path)
>>           ipInfo = Utils.getIpInfo(client_ip, b.value)
>>       }
>>   }

--
Haoyuan Li
AMPLab, EECS, UC Berkeley
http://www.cs.berkeley.edu/~haoyuan/
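(Editor's note: the "task not serializable" failure Akhil describes can be reproduced outside Spark. The sketch below uses a hypothetical `Context` class as a stand-in for the non-serializable SparkContext; it shows that a closure capturing the context fails Java serialization, while a closure capturing only a pre-loaded value, which is essentially what broadcasting achieves, serializes fine.)

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for SparkContext: a handle that is deliberately NOT Serializable.
class Context {
  def load(path: String): Map[String, String] = Map("1.2.3.4" -> "US")
}

object ClosureCaptureDemo {
  // Returns true if the closure survives Java serialization, which is what
  // Spark must do before shipping a map function to executors.
  def serializable(f: () => Any): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(f)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val ctx = new Context // plays the role of sc

    // Bad: the closure captures ctx itself, like calling
    // Utils.load(sc, ip_lib_path) inside map.
    val bad = () => ctx.load("ip_lib.csv")

    // Good: load once on the driver and capture only the resulting value.
    val table = ctx.load("ip_lib.csv")
    val good = () => table.get("1.2.3.4")

    println(s"captures context: serializable = ${serializable(bad)}")
    println(s"captures value:   serializable = ${serializable(good)}")
  }
}
```

The practical fix follows the same shape: do the `Utils.load(sc, ...)` on the driver, broadcast the result, and reference only the broadcast value inside `map`; any re-broadcast must likewise happen on the driver (e.g. between batches), never inside a task.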