At 2014-12-04 16:26:50 -0800, spr <s...@yarcdata.com> wrote:
> I'm also looking at how to represent literals as vertex properties. It seems
> one way to do this is via positional convention in an Array/Tuple/List that is
> the VD; i.e., to represent height, weight, and eyeColor, the VD could be a
> Tuple3(Double, Double, String).
> [...]
> Given that vertices can have many many properties, it seems memory consumption
> for the properties should be as parsimonious as possible. Will any of
> Array/Tuple/List support sparse usage? Is Option the way to get there?

Storing vertex properties positionally with Array[Option[Any]] or any of the
other sequence types will provide a dense representation. For a sparse
representation, the right data type is a Map[String, Any], which will let you
access properties by name and will only store the nonempty properties. Since
the value type in the map has to be Any, or more precisely the least upper
bound of the property types, this sacrifices type safety and you'll have to
downcast when retrieving properties.

If there are particular subsets of the properties that frequently go together,
you could instead use a class hierarchy. For example, if the vertices are
either people or products, you could use the following:

    sealed trait VertexProperty extends Serializable
    case class Person(name: String, weight: Int) extends VertexProperty
    case class Product(name: String, price: Int) extends VertexProperty

Then you could pattern match against the hierarchy instead of downcasting:

    List(Person("Bob", 180), Product("chair", 800), Product("desk", 200)).flatMap {
      case Person(name, weight) => Array.empty[Int]
      case Product(name, price) => Array(price)
    }.sum

Ankur
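
For reference, here is a minimal sketch of the sparse Map[String, Any] representation mentioned in the reply. The property names come from the original question. The object name `SparseVertexProps`, the `Props` alias, the sample vertices, and the helper `doubleProp` are all hypothetical and chosen only for illustration. The sketch uses a type pattern to do the downcast, so a missing or wrongly typed property yields None rather than a ClassCastException.

```scala
object SparseVertexProps {
  // Hypothetical alias: each vertex stores only the properties it actually has.
  type Props = Map[String, Any]

  // Illustrative vertices; neither stores all three properties.
  val alice: Props = Map("height" -> 1.7, "eyeColor" -> "brown")
  val bob: Props = Map("weight" -> 80.0)

  // Hypothetical helper: look up a property and downcast it with a type pattern.
  // Returns None when the key is absent or the value is not a Double.
  def doubleProp(props: Props, key: String): Option[Double] =
    props.get(key).collect { case d: Double => d }

  def main(args: Array[String]): Unit = {
    println(doubleProp(alice, "height"))   // Some(1.7)
    println(doubleProp(bob, "height"))     // None: property not stored
    println(doubleProp(alice, "eyeColor")) // None: stored, but not a Double
  }
}
```

The same pattern works for other value types by changing the type pattern. The compiler cannot check that a given key holds a given type, which is the type-safety trade-off the reply describes.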