Environment: CDH 5.7.1, PySpark.

Code:
===================
from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName('s3 ---')
sc = SparkContext(conf=conf)

myRdd = sc.textFile("s3n://****/y=2016/m=5/d=26/h=20/2016.5.26.21.9.52.6d53180a-28b9-4e65-b749-b4a2694b9199.json.gz")

count = myRdd.count()
print "The count is", count

=======================
Standalone mode, command line:

AWS_ACCESS_KEY_ID=??? AWS_SECRET_ACCESS_KEY=??? ./bin/spark-submit \
  --driver-memory 4G --master spark://master:7077 \
  --conf "spark.default.parallelism=70" /root/workspace/test/s3.py

Error:
16/07/27 17:27:26 INFO spark.SparkContext: Created broadcast 0 from textFile at NativeMethodAccessorImpl.java:-2
Traceback (most recent call last):
  File "/root/workspace/test/s3.py", line 12, in <module>
    count = myRdd.count()
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1004, in count
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 995, in sum
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 869, in fold
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore.copy(Ljava/lang/String;Ljava/lang/String;)V @155: invokevirtual
  Reason:
    Type 'org/jets3t/service/model/S3Object' (current frame, stack[4]) is not assignable to 'org/jets3t/service/model/StorageObject'
  Current Frame:
    bci: @155
    flags: { }
    locals: { 'org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object' }
    stack: { 'org/jets3t/service/S3Service', 'java/lang/String', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object', integer }
  Bytecode:
    0000000: b200 fcb9 0190 0100 9900 39b2 00fc bb01
    0000010: 5659 b701 5713 0192 b601 5b2b b601 5b13
    0000020: 0194 b601 5b2c b601 5b13 0196 b601 5b2a
    0000030: b400 7db6 00e7 b601 5bb6 015e b901 9802
    0000040: 002a b400 5799 0030 2ab4 0047 2ab4 007d
    0000050: 2b01 0101 01b6 019b 4e2a b400 6b09 949e
    0000060: 0016 2db6 019c 2ab4 006b 949e 000a 2a2d
    0000070: 2cb6 01a0 b1bb 00a0 592c b700 a14e 2d2a
    0000080: b400 73b6 00b0 2ab4 0047 2ab4 007d b600
    0000090: e72b 2ab4 007d b600 e72d 03b6 01a4 57a7
    00000a0: 000a 4e2a 2d2b b700 c7b1               
  Exception Handler Table:
    bci [0, 116] => handler: 162
    bci [117, 159] => handler: 162
  Stackmap Table:
    same_frame_extended(@65)
    same_frame(@117)
    same_locals_1_stack_item_frame(@162,Object[#139])
    same_frame(@169)

        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:338)
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:328)

.....
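
The VerifyError above says that S3Object is not assignable to StorageObject. This kind of failure is commonly reported when the jets3t jar loaded at runtime is a different version from the one Hadoop was built against. That is only a guess for this cluster, not something the trace confirms. A minimal sketch for checking it, assuming the jars follow a jets3t-<version>.jar naming pattern (the helper name find_jets3t_jars and the directories passed to it are hypothetical):

```python
import os
import re
from collections import defaultdict

# Assumed naming pattern, e.g. jets3t-0.9.0.jar; adjust if the jars are named differently.
JETS3T_JAR = re.compile(r"^jets3t-(\d+(?:\.\d+)*).*\.jar$")


def find_jets3t_jars(roots):
    """Return {version: [paths]} for every jets3t jar found under the given roots."""
    found = defaultdict(list)
    for root in roots:
        # os.walk does not descend into symlinked directories by default,
        # but symlinked jar files inside a real directory are still listed.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                match = JETS3T_JAR.match(name)
                if match:
                    found[match.group(1)].append(os.path.join(dirpath, name))
    return dict(found)
```

Calling find_jets3t_jars(["/opt/cloudera/parcels"]) on the driver and on each worker and comparing the keys would show whether more than one jets3t version is present. It does not show which jar the JVM actually picks up.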

TIA




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-6-0-read-s3-files-error-tp27417.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
