You can put the database file in a central location accessible to all the workers, then build the GeoIP reader once per partition inside a mapPartitions over your dataset, loading it from that central location. That way the non-serializable reader object never has to be shipped from the driver — only the path does.
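A minimal, self-contained sketch of that per-partition initialization pattern. `FakeGeoDb` is a hypothetical stand-in for MaxMind's non-serializable reader, and the path and lookup results are made up for illustration; in a real job you would construct the actual reader (e.g. MaxMind's `DatabaseReader`) at the same point, inside the function passed to `mapPartitions`.

```scala
object PerPartitionInit {
  // Hypothetical stand-in for a non-serializable GeoIP database reader.
  final class FakeGeoDb(path: String) {
    def lookup(ip: String): String =
      if (ip.startsWith("8.")) "US" else "unknown"
  }

  // The function handed to mapPartitions: it receives an iterator over one
  // partition's records. The reader is built ONCE per partition, not once
  // per record, and only the (serializable) dbPath string crosses the wire.
  def mapPartition(ips: Iterator[String], dbPath: String): Iterator[(String, String)] = {
    val reader = new FakeGeoDb(dbPath) // expensive init, once per partition
    ips.map(ip => (ip, reader.lookup(ip)))
  }

  def main(args: Array[String]): Unit = {
    // In a real Spark job this would be:
    //   rdd.mapPartitions(it => mapPartition(it, "/shared/GeoLite2.mmdb"))
    val out = mapPartition(Iterator("8.8.8.8", "10.0.0.1"), "/shared/GeoLite2.mmdb").toList
    out.foreach(println)
  }
}
```

The key point is that the reader is constructed inside the closure, so serialization of the database object never arises.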
___
From: Filli Alem [alem.fi...@ti8m.ch]
Sent: Wednesday, July 29, 2015 12:04 PM
To: user@spark.apache.org
Subject: IP2Location within spark jobs

Hi,

I would like to use IP2Location databases (MaxMind) during my Spark jobs. So far I haven't found a way to properly serialize the database object offered by its Java API. The CSV version isn't easy to handle either, as it consists of multiple files.

Any recommendations on how to do this?

Thanks,
Alem