Hi John,
I think this relates to driver memory more than the other things you mentioned.

Can you try giving the driver more memory?
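For example, just a sketch (the memory values, class name, and jar name below are placeholders; adjust them to your cluster and job):

    # give the driver (and executors) more headroom; tune to your machines
    spark-submit \
      --driver-memory 8g \
      --executor-memory 8g \
      --class com.example.LoadFromMongo \
      your-job.jar

Note that spark.driver.memory has to be set before the driver JVM starts, so in client mode pass it on the spark-submit command line (or put it in spark-defaults.conf) rather than setting it in SparkConf inside the application.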




> On Jul 1, 2016, at 9:03 AM, johnzeng <jo...@fossil.com> wrote:
> 
> I am trying to load a 1 TB collection from Mongo into a Spark cluster, but I
> keep getting a stack overflow error after it runs for a while.
> 
> I have posted a question on stackoverflow.com and tried all the advice
> provided there, but nothing works:
> 
> how to load large database into spark
> <http://stackoverflow.com/questions/38096502/how-to-load-large-table-in-spark>
>   
> 
> I have tried:
> 1. Used persist with MEMORY_AND_DISK; same error after the same running time.
> 2. Added more instances; same error after the same running time.
> 3. Ran the same script on a much smaller collection; everything was fine, so
> I think my code is correct.
> 4. Removed the reduce step; same error after the same running time.
> 5. Removed the map step; same error after the same running time.
> 6. Changed the SQL I use; it ran faster, but the same error appeared after a
> shorter time.
> 7. Retrieved "_id" instead of "u_at" and "c_at"; same error after the same
> running time.
> 
> Does anyone know how many resources I need to handle this 1 TB database? I
> only retrieve two fields from it, and these fields are only about 1% of each
> document (because each document has an array containing 90+ embedded
> documents).
> 
> 
> 


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
