Hi, we are trying to adopt Spark for our application.
We have an analytical application that stores data in star schemas (SQL Server). All the cubes are loaded into a key/value structure and saved in Trove (an in-memory collection library). The key is a short array in which each short number represents a dimension member. For example, the tuple (CampaignX, Product1, Region_south, 10.23232) is converted to the Trove Key [12322, 45232, 53421] & Value [10.23232]. This is done to avoid storing collections of String objects as keys in Trove.

Can we save this data structure in Spark using a pairRDD? If yes, would key/value be an ideal way of storing the data in Spark and retrieving it for analysis, or is there a better data structure we could create that would help us build and process the RDD?

Nitin.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Efficient-Key-Structure-in-pairRDD-tp18461.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
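To make the key encoding above concrete, here is a minimal, hypothetical Java sketch (class name `DimKey` and the dimension IDs are illustrative, not from the original application). One practical caveat it demonstrates: a raw Java `short[]` uses reference equality and identity hash codes, so two arrays holding the same dimension members do not compare equal. That matters both for Trove/HashMap lookups and for Spark pair-RDD operations such as `reduceByKey`/`groupByKey`, which group keys by `equals`/`hashCode`. Wrapping the array in a small serializable class that delegates to `java.util.Arrays` gives the key value semantics:

```java
import java.util.Arrays;

// Hypothetical wrapper for a dimension-member key encoded as a short array.
// Delegates equals/hashCode to Arrays.equals/Arrays.hashCode so that two
// keys with the same members compare equal -- a requirement for correct
// grouping in hash-based maps and in Spark pairRDD shuffles.
public final class DimKey implements java.io.Serializable {
    private final short[] members;

    public DimKey(short[] members) {
        // Defensive copy keeps the key immutable after construction.
        this.members = members.clone();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof DimKey && Arrays.equals(members, ((DimKey) o).members);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(members);
    }

    public static void main(String[] args) {
        short[] a = {123, 452, 534};  // illustrative dimension-member IDs
        short[] b = {123, 452, 534};
        System.out.println(a.equals(b));                         // reference equality: false
        System.out.println(new DimKey(a).equals(new DimKey(b))); // value equality: true
    }
}
```

With a wrapper like this, a `JavaPairRDD<DimKey, Double>` (or, in Scala, a case class key or a `Seq[Short]` key) should aggregate correctly; raw arrays as keys would silently fail to group.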