Hi Owen,

Thank you for your reply. I've heard some people say that ORC stands for "Owen's RC file" haha ;)

Also, after I posted, several people told me this is a known issue with AWS EMR 4.0.0. They said it might be a compatibility issue between Hive 0.13.1 and Spark 1.4.1, and that AWS will be launching EMR 4.1.0 soon.
Do you have a stack trace of the array out of bounds exception? I don't
remember an array out of bounds problem off the top of my head. A stack
trace will tell me a lot, obviously.

If you are using Spark 1.4 that implies Hive 0.13, which is pretty old. It
may be a problem that we fixed a while ago.
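To show why the full trace matters (this is not Spark code, just a minimal self-contained illustration; the class and method names `TraceDemo` and `readAt` are made up): the innermost frames of an ArrayIndexOutOfBoundsException name the exact class, method, and line where the bad index was used, which is what identifies the failing reader code. Generic tail frames like `ThreadPoolExecutor$Worker.run` alone don't narrow anything down.

```java
public class TraceDemo {
    // Deliberately indexes out of range when i >= buf.length.
    static int readAt(int[] buf, int i) {
        return buf[i];
    }

    public static void main(String[] args) {
        try {
            readAt(new int[4], 10);
        } catch (ArrayIndexOutOfBoundsException e) {
            // The first (innermost) frame points at readAt(),
            // not just the thread-pool tail of the trace.
            StackTraceElement top = e.getStackTrace()[0];
            System.out.println(top.getClassName() + "#" + top.getMethodName());
        }
    }
}
```

Running this prints `TraceDemo#readAt` — the frame that actually did the bad indexing, which is the part of a real trace worth posting.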
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Question-ORC-EMRFS-Problem-tp24673p24675.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Good Day!

I think there are some problems between ORC and AWS EMRFS. When I was trying to read ORC files larger than about 150 MB from S3, an ArrayIndexOutOfBoundsException occurred. I'm sure it's an issue on the AWS side, because there was no exception when reading the same files from HDFS or via S3NativeFileSystem. Parquet runs fine.