At 2014-12-04 16:26:50 -0800, spr <s...@yarcdata.com> wrote:
> I'm also looking at how to represent literals as vertex properties. It seems
> one way to do this is via positional convention in an Array/Tuple/List that is
> the VD; i.e., to represent height, weight, and eyeColor, the VD could be a
> Tuple3(Double, Double, String).
> [...]
> Given that vertices can have many many properties, it seems memory consumption
> for the properties should be as parsimonious as possible. Will any of
> Array/Tuple/List support sparse usage? Is Option the way to get there?

Storing vertex properties positionally with Array[Option[Any]] or any of the
other sequence types gives a dense representation: every vertex carries a slot
for every property, whether or not it is set. For a sparse representation, the
right data type is Map[String, Any], which lets you access properties by name
and stores only the nonempty ones.
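
As a rough sketch of the map-based approach (the object name, property names,
and values here are made up for illustration, reusing height, weight, and
eyeColor from the question), a vertex stores only the entries it actually has,
and a lookup by name returns an Option:

```scala
object SparseProps {
  // Hypothetical alias for a sparse property bag; not part of GraphX.
  type Props = Map[String, Any]

  // Illustrative vertices: alice has three properties, bob only one.
  val alice: Props = Map("height" -> 1.7, "weight" -> 60.0, "eyeColor" -> "brown")
  val bob: Props = Map("height" -> 1.8)

  // Lookup by name yields Option[Any]; an absent property is simply None.
  val bobEyeColor: Option[Any] = bob.get("eyeColor")

  // The values are typed Any, so a type pattern is used to recover a Double.
  // A property stored with a different type is treated as missing.
  def heightOf(props: Props): Option[Double] =
    props.get("height").collect { case h: Double => h }

  // Collect the heights of all vertices that have one.
  val heights: List[Double] = List(alice, bob).flatMap(heightOf)
}
```

A map like this could then serve as the VD type, at the cost of the untyped
values visible in heightOf.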

Since the value type in the map has to be Any, or more precisely the least 
upper bound of the property types, this sacrifices type safety and you'll have 
to downcast when retrieving properties. If there are particular subsets of the 
properties that frequently go together, you could instead use a class 
hierarchy. For example, if the vertices are either people or products, you 
could use the following:

    sealed trait VertexProperty extends Serializable
    case class Person(name: String, weight: Int) extends VertexProperty
    case class Product(name: String, price: Int) extends VertexProperty

Then you could pattern match against the hierarchy instead of downcasting:

    List(Person("Bob", 180), Product("chair", 800),
         Product("desk", 200)).flatMap {
      case Person(name, weight) => Array.empty[Int]
      case Product(name, price) => Array(price)
    }.sum

Ankur
