Thanks for the suggestions. I ended up switching to JDK 1.7+ just to make
the code more readable. I will take a look at the EWAH implementation as
well.
Jim
On Sun, May 12, 2013 at 3:40 PM, Bertrand Dechoux wrote:
You can disregard my links, as they are only valid for Java 1.7+.
JavaSerialization might clean up your code but shouldn't bring a
significant boost in performance.
The EWAH implementation has, at least, the methods you are looking for:
serialize / deserialize.
Regards
Bertrand
Note to myself
Another interesting alternative is the EWAH implementation of Java bitsets,
which provides efficient compressed bitsets with very fast OR operations.
https://github.com/lemire/javaewah
See also https://code.google.com/p/sparsebitmap/ by the same authors.
On Sun, May 12, 2013 at 1:11 PM, Bertrand Dec
In order to make the code more readable, you could start by using the
methods toByteArray() and valueOf(bytes)
http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html#toByteArray%28%29
http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html#valueOf%28byte[]%29
Regards
Bertrand
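For what it's worth, here is a minimal sketch of what the two methods buy you, assuming Java 7+. The class name BitSetCodec and the length-prefix layout are my own choices for illustration, not something from the original code. The write/read pair only mirrors the shape of Writable's write(DataOutput) and readFields(DataInput), so it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical helper: round-trips a BitSet through DataOutput/DataInput.
final class BitSetCodec {

    // Writes a 4-byte length prefix followed by the bytes from toByteArray().
    static void write(BitSet bits, DataOutput out) throws IOException {
        byte[] bytes = bits.toByteArray();
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads the length prefix, then rebuilds the BitSet with valueOf(bytes).
    static BitSet read(DataInput in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return BitSet.valueOf(bytes);
    }

    public static void main(String[] args) throws IOException {
        BitSet original = new BitSet();
        original.set(3);
        original.set(64);
        original.set(1000);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(original, new DataOutputStream(buffer));

        BitSet copy = read(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(copy.equals(original));
    }
}
```

In a real Writable wrapper the same two method bodies would presumably go into write and readFields, with the BitSet held as a field.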
On
You can perhaps consider using the experimental JavaSerialization [1]
enhancement to skip transforming to
Writables/other-serialization-formats. It may be slower but looks like
you are looking for a way to avoid transforming objects.
Enable by adding the class
org.apache.hadoop.io.serializer.JavaS
I have large java.util.BitSet objects that I want to bitwise-OR using a
MapReduce job. I decided to wrap each object in a class that implements the
Writable interface. Right now I convert each BitSet to a byte array and
serialize the byte array to disk.
Converting them to byte arrays is a bit inefficient but I
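
For readers of the archive, the OR step described above can be sketched roughly as follows, assuming Java 7+ and byte arrays produced by BitSet.toByteArray(). The class and method names (BitSetUnion, orAll) are hypothetical, and no Hadoop types are used. In an actual job the loop body would presumably sit in the reducer, iterating over the values for a key.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper: ORs together BitSets that arrive as byte arrays.
final class BitSetUnion {

    // Decodes each byte array with BitSet.valueOf and folds it into one result.
    static BitSet orAll(Iterable<byte[]> encoded) {
        BitSet result = new BitSet();
        for (byte[] bytes : encoded) {
            result.or(BitSet.valueOf(bytes));
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1);
        a.set(70);

        BitSet b = new BitSet();
        b.set(1);
        b.set(500);

        List<byte[]> encoded = Arrays.asList(a.toByteArray(), b.toByteArray());
        // Bits 1, 70 and 500 should be set in the union.
        System.out.println(orAll(encoded));
    }
}
```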