ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread MEETHU MATHEW
Hi, I am getting an ArrayIndexOutOfBoundsException while reading bz2 files from HDFS. I have come across the same issue in JIRA at https://issues.apache.org/jira/browse/SPARK-1861, but it seems to be resolved. I have tried the suggested workaround (SPARK_WORKER_CORES=1), but it's still showing

Re: ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread Akhil Das
Can you paste the piece of code? Thanks, Best Regards

Re: ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread MEETHU MATHEW
Hi Akhil, Please find the code below.

    from operator import add

    x = sc.textFile("hdfs:///**")
    # keep lines whose first and fourth comma-separated fields are not a single space
    x = x.filter(lambda z: z.split(",")[0] != ' ')
    x = x.filter(lambda z: z.split(",")[3] != ' ')
    z = x.reduce(add)

Thanks & Regards, Meethu M
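For anyone who wants to check the filter-and-reduce logic of the posted snippet separately from the bzip2 read path, here is a minimal sketch in plain Python with no Spark involved. The sample lines and the helper name `keep` are made up for illustration; it assumes comma-separated lines with at least four fields, as the posted code appears to.

```python
from functools import reduce
from operator import add

def keep(line):
    """Mirror the two filters from the posted code (hypothetical helper):
    drop lines whose first or fourth field is a single space."""
    fields = line.split(",")
    return fields[0] != " " and fields[3] != " "

# Made-up sample data standing in for the lines of the bz2 file.
lines = ["a,b,c,d", " ,b,c,d", "a,b,c, ", "e,f,g,h"]

kept = [line for line in lines if keep(line)]

# reduce(add) on strings concatenates them, just as RDD.reduce(add) would
# on an RDD of lines.
result = reduce(add, kept)
```

If this behaves as expected on a sample of the data, the exception is more likely coming from reading the compressed file than from the lambdas.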

Re: ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread Sean Owen
Try searching online or in the Spark JIRA. This was a known upstream bug in Hadoop: https://issues.apache.org/jira/browse/SPARK-1861

Re: ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread MEETHU MATHEW
Hi Sean, Thank you for the fast response. Thanks & Regards, Meethu M

Re: ArrayIndexOutOfBoundsException when reading bzip2 files

2014-06-09 Thread sam
/ArrayIndexOutOfBoundsException-when-reading-bzip2-files-tp7237p7263.html Sent from the Apache Spark User List mailing list archive at Nabble.com.