I want to process large numbers (for example, 160 bits) in Spark. I could
store them as an array of ints, as a java.util.BitSet, or as something with
compression like https://github.com/lemire/javaewah or
https://github.com/RoaringBitmap/RoaringBitmap. My question is: what should I
use so that Spark works the fastest? (The operations I do on those numbers
are computationally light.) Should my priority be:
1. that the fewest objects are created and primitive types are used the most
(is an array of ints good for that? see the sketch below)
2. or that the representation consumes the fewest bytes, for example by doing
its own compression
3. or maybe it is possible to combine 1 and 2 by using some kind of native
Spark compression
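
To make option 1 concrete, here is a rough sketch of what I have in mind
(the object name, mask value, and data sizes are all made up for
illustration; it assumes the Kryo serializer that ships with Spark):

import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random

object BigNumberRepr {
  // 160 bits fit exactly in five 32-bit ints (5 * 32 = 160).
  val Words = 160 / 32

  // Bitwise AND of two 160-bit values held as primitive int arrays;
  // a tight while loop over primitive arrays allocates only the result.
  def and(a: Array[Int], b: Array[Int]): Array[Int] = {
    val out = new Array[Int](Words)
    var i = 0
    while (i < Words) {
      out(i) = a(i) & b(i)
      i += 1
    }
    out
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("large-number-representation")
      // Kryo writes primitive arrays compactly, without custom compression.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val sc = new SparkContext(conf)

    // Made-up data: 1000 random 160-bit numbers, each an Array[Int] of 5.
    val numbers =
      sc.parallelize(Seq.fill(1000)(Array.fill(Words)(Random.nextInt())))
    val mask = Array.fill(Words)(0x0F0F0F0F) // arbitrary mask, illustration only
    val masked = numbers.map(n => and(n, mask))
    println(masked.count())
    sc.stop()
  }
}

My thinking is that a fixed-size Array[Int] keeps everything in primitives
and lets Kryo handle compact serialization, but I do not know whether that
beats RoaringBitmap in practice.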

Thank you in advance for all help,
Regards, Zephod


