We have a huge binary file in a custom serialization format (e.g. a header
gives the length of the record, followed by a varying number of items for
that record). It is produced by an old C++ application.
What would be the best approach to deserialize it into a Hive table or a Spark
RDD?
The format is known and well documented.
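
To make the kind of layout concrete, here is a minimal Python sketch of a
reader for a length-prefixed format. The details are assumptions for
illustration only, not the real format: each record is taken to start with a
4-byte little-endian unsigned item count, followed by that many 8-byte signed
integers. The names read_records, HEADER and ITEM are hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator, List

HEADER = struct.Struct("<I")  # assumed: 4-byte little-endian item count
ITEM = struct.Struct("<q")    # assumed: each item is an 8-byte signed integer


def read_records(stream: BinaryIO) -> Iterator[List[int]]:
    """Yield one list of items per record until the stream is exhausted."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of input
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        (count,) = HEADER.unpack(header)
        payload = stream.read(count * ITEM.size)
        if len(payload) < count * ITEM.size:
            raise ValueError("truncated record body")
        # iter_unpack yields 1-tuples, one per item in the payload
        yield [value for (value,) in ITEM.iter_unpack(payload)]


# Build two sample records in memory and parse them back.
sample = (
    HEADER.pack(2) + ITEM.pack(10) + ITEM.pack(20)
    + HEADER.pack(1) + ITEM.pack(-5)
)
records = list(read_records(io.BytesIO(sample)))
```

Since the records are variable-length, a reader like this has to walk the
file sequentially from a known record boundary; the file cannot be cut at
arbitrary byte offsets unless there is an index or some sync marker.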


-- 
Ruslan Dautkhanov
