xukaiqiang created SPARK-11519:
----------------------------------

             Summary: Spark MemoryStore with Hadoop SequenceFile caches the same record for every value.
                 Key: SPARK-11519
                 URL: https://issues.apache.org/jira/browse/SPARK-11519
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.1.0
         Environment: JDK 1.7.0, Spark 1.1.0, Hadoop 2.3.0
            Reporter: xukaiqiang


I create an RDD with newAPIHadoopFile from a file in SequenceFile format. When the RDD is cached in memory, the cache stores the same Java object for every record.
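
A minimal sketch of the job (the input path is a placeholder; RecordObject is our application's custom value class):

{code:scala}
import org.apache.hadoop.io.LongWritable
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
import org.apache.spark.{SparkConf, SparkContext}

import com.data.analysis.domain.RecordObject

object SequenceFileCacheRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("seqfile-cache-repro"))

    // Read the SequenceFile through the new Hadoop API.
    val rdd = sc.newAPIHadoopFile(
      "hdfs:///path/to/input",
      classOf[SequenceFileInputFormat[LongWritable, RecordObject]],
      classOf[LongWritable],
      classOf[RecordObject])

    // Cache in memory, then print the cached pairs.
    rdd.cache()
    rdd.collect().foreach { case (k, v) => println(s"[$k, $v]") }

    sc.stop()
  }
}
{code}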

Reading the Hadoop file with SequenceFileRecordReader into a NewHadoopRDD yields key/value pairs such as:
[1, com.data.analysis.domain.RecordObject@54cdb594]
[2, com.data.analysis.domain.RecordObject@54cdb594]
[3, com.data.analysis.domain.RecordObject@54cdb594]
Every value prints as the same Java object, although I am sure the record contents on disk are not the same.
But after caching the RDD with the Spark memory cache, the MemoryStore vector holds an entry for every record, and each entry's value is the last value read from the NewHadoopRDD.
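
The likely cause is that Hadoop's RecordReader reuses a single Writable instance for every record in a split, so all cached entries end up pointing at that one object. A workaround sketch (assuming RecordObject implements Writable, which it must to be stored in a SequenceFile) is to clone each value before caching:

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.WritableUtils

// Clone each value as it is read, so the cache holds distinct objects
// instead of many references to one reused Writable instance. A fresh
// Configuration is created per partition to keep the closure serializable.
val cached = rdd.mapPartitions { iter =>
  val conf = new Configuration()
  iter.map { case (k, v) => (k.get(), WritableUtils.clone(v, conf)) }
}.cache()
{code}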


