Itsuki Toyota created JENA-1233:
-----------------------------------

             Summary: Make RDF primitives Serializable
                 Key: JENA-1233
                 URL: https://issues.apache.org/jira/browse/JENA-1233
             Project: Apache Jena
          Issue Type: Improvement
          Components: Elephas
    Affects Versions: Jena 3.1.0
            Reporter: Itsuki Toyota


I always use Jena when handling RDF data with Apache Spark.
However, when I want to store the resulting RDD data (e.g. RDD[Triple]) in
binary format, I can't call the RDD.saveAsObjectFile method.
This is because RDD.saveAsObjectFile requires the RDD's elements to implement
the java.io.Serializable interface.

See the following code. 
https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L1469
https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L79-L86

You can see that
1) RDD.saveAsObjectFile calls the Utils.serialize method
2) Utils.serialize requires the object wrapped in the RDD to implement the
java.io.Serializable interface. For example, if you want to save an
RDD[Triple] object, Triple must implement java.io.Serializable.
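
To illustrate the two points above without needing Spark or Jena on the classpath, here is a minimal sketch in plain Java. It assumes that the linked Utils.serialize writes the object through a java.io.ObjectOutputStream, as the linked lines suggest. PlainTriple and SerializableTriple are hypothetical stand-ins introduced only for this example; they are not Jena classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {

    // Hypothetical stand-in for a triple class that does NOT implement Serializable.
    static class PlainTriple {
        final String subject;
        final String predicate;
        final String object;

        PlainTriple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Hypothetical stand-in for the same class once it implements Serializable.
    static class SerializableTriple implements Serializable {
        private static final long serialVersionUID = 1L;
        final String subject;
        final String predicate;
        final String object;

        SerializableTriple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Writes an object to a byte array using standard Java serialization.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    // Returns false when Java serialization rejects the object at runtime.
    static boolean canSerialize(Object o) throws IOException {
        try {
            serialize(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // The class without Serializable is rejected at runtime: prints false
        System.out.println(canSerialize(new PlainTriple("s", "p", "o")));
        // The class with Serializable is written successfully: prints true
        System.out.println(canSerialize(new SerializableTriple("s", "p", "o")));
    }
}
```

Note that the failure is a runtime NotSerializableException rather than a compile error, which is why the problem only shows up when the save is actually executed.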

So why not have the RDF primitives (e.g. Triple) implement java.io.Serializable?
I think it would improve usability with Apache Spark.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
