I have a sequence file whose header looks like this:

SEQorg.apache.hadoop.io.Textorg.apache.hadoop.io.Text'org.apache.hadoop.io.compress.GzipCodec?v?


Key = Text

Value = Text

and it appears to be compressed with GzipCodec.

How should I read it from Spark?

I am using

val x = sc.sequenceFile(dwTable, classOf[Text], classOf[Text]).partitionBy(
  new org.apache.spark.HashPartitioner(7919))

When I run

x.take(10).foreach(println)

every record returned is identical. How is that possible? The records in
this sequence file are guaranteed to be unique.

-- 
Deepak
