Hi Pradeep,

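I'm afraid you're running into a hard JVM limit rather than a Spark
one. Java strings and arrays are indexed with signed 32-bit integers,
so they cannot hold more than 2^31 - 1 (about 2.1 billion) elements.
`wholeTextFiles` returns each file's content as a single string, so a
2 GB file exceeds that limit. Could you use `textFile` as a
workaround? It gives you an RDD of the files' lines instead.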
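
To make the limit concrete, here is a minimal sketch in plain Java (no
Spark needed). The class name `StringLimit` and the helper
`fitsInJavaArray` are made up for illustration, and the check assumes
one array element per byte of the file. In practice the usable maximum
can be slightly below `Integer.MAX_VALUE` depending on the JVM, and
you can run out of heap well before reaching it.

```java
public class StringLimit {

    // Hypothetical helper: true if a payload of the given size could be
    // addressed by a single Java array (and hence a single String),
    // assuming one array element per byte.
    static boolean fitsInJavaArray(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long oneAndAHalfGiB = 1536L * 1024 * 1024; // 1,610,612,736 bytes
        long twoGiB = 2048L * 1024 * 1024;         // 2,147,483,648 bytes

        System.out.println("Max array index range: " + Integer.MAX_VALUE);
        System.out.println("1.5 GiB fits: " + fitsInJavaArray(oneAndAHalfGiB));
        System.out.println("2 GiB fits:   " + fitsInJavaArray(twoGiB));
    }
}
```

This matches what you are seeing: the 1.5 GB file stays under the
limit, while the 2 GB file is just over it.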

In general, the guide at http://spark.apache.org/contributing.html
explains how to contribute to Spark, including how to file bug reports
(which doesn't apply in this case, since this isn't a bug in Spark).

regards,
--Jakob

On Mon, Dec 12, 2016 at 7:34 PM, Pradeep <pradeep.mi...@mail.com> wrote:
> Hi,
>
> Why is there a restriction on the maximum file size that can be read by
> the wholeTextFiles() method?
>
> I can read a 1.5 GB file but get an out-of-memory error for a 2 GB file.
>
> Also, how can I raise this as a defect in the Spark JIRA? Can someone
> please guide me?
>
> Thanks,
> Pradeep
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>

