Collecting it as a regular (Java/Scala/Python) map. You can also broadcast the map if you're going to use it multiple times.
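A minimal sketch of that pattern, written as plain Java (no Spark dependency) so it runs standalone. The names (`activeLogins`, `keepKnownLogins`) are illustrative, not from the thread; in Spark the map would come from something like `collectAsMap()` on the driver and be wrapped with `sc.broadcast(...)`, and the lookup below is what the closure shipped to the executors would do:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BroadcastLookupSketch {
    // The per-record filtering a Spark closure would do: a plain, serializable
    // map lookup -- no RDD or DataFrame reference inside the function.
    static List<String> keepKnownLogins(Map<String, Boolean> lookup, List<String> logins) {
        return logins.stream()
                .filter(login -> Boolean.TRUE.equals(lookup.get(login)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Driver side: static data loaded from the database into a regular map.
        // In Spark this would be collected once (e.g. via collectAsMap()) and
        // broadcast for reuse across tasks -- names here are assumptions.
        Map<String, Boolean> activeLogins = new HashMap<>();
        activeLogins.put("test", true);
        activeLogins.put("admin", false);

        // Executor side: filter records against the captured map.
        System.out.println(keepKnownLogins(activeLogins,
                Arrays.asList("test", "admin", "nobody")));
        // prints [test]
    }
}
```

Because the closure captures only a serializable `Map` (or a `Broadcast<Map<...>>`), it can be shipped to executor nodes, which is exactly what fails when an RDD or DataFrame is referenced inside the function.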
On Wednesday, July 1, 2015, Ashish Soni <asoni.le...@gmail.com> wrote:

> Thanks. So if I load some static data from a database and then need to
> use that in my map function to filter records, what will be the best way
> to do it?
>
> Ashish
>
> On Wed, Jul 1, 2015 at 10:45 PM, Raghavendra Pandey
> <raghavendra.pan...@gmail.com> wrote:
>
>> You cannot refer to one RDD inside another RDD's map function. The RDD
>> object is not serializable. Whatever objects you use inside the map
>> function should be serializable, as they get transferred to the
>> executor nodes.
>>
>> On Jul 2, 2015 6:13 AM, "Ashish Soni" <asoni.le...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I am not sure what is wrong with the code below, as it gives the error
>>> below when I access the DataFrame inside the map, but it works outside:
>>>
>>> JavaRDD<Charge> rdd2 = rdd.map(new Function<Charge, Charge>() {
>>>
>>>     @Override
>>>     public Charge call(Charge ch) throws Exception {
>>>
>>>         DataFrame df = accountRdd.filter("login=test");
>>>
>>>         return ch;
>>>     }
>>>
>>> });
>>>
>>> 15/07/01 20:38:08 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
>>> java.lang.NullPointerException
>>>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:129)
>>>     at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$logicalPlanToDataFrame(DataFrame.scala:154)

--
Cell: 425-233-8271
Twitter: https://twitter.com/holdenkarau
LinkedIn: https://www.linkedin.com/in/holdenkarau