Re: Submitting job with external dependencies to pyspark

2020-01-28 Thread Tharindu Mathew
/docs/latest/submitting-applications.html#bundling-your-applications-dependencies
>
> I hope that helps.
>
> On Tue, 28 Jan 2020, 9:46 am Tharindu Mathew wrote:
>
>> Hi,
>>
>> Newbie to pyspark/spark here.
>>
>> I'm trying to submit a job to pyspark

Submitting job with external dependencies to pyspark

2020-01-27 Thread Tharindu Mathew
. -- Regards, Tharindu Mathew http://tharindumathew.com

Avoid RDD shuffling in a join after Distributed Matrix operation

2016-08-22 Thread Tharindu
Hi, I just wanted to get your input on how to avoid RDD shuffling in a join after a distributed matrix operation in Spark. The following is what my app would look like:
1. Create a dense matrix as input to calculate the cosine distance between columns: val rowMarixIn = sc.textFile("input.csv").map{ line

Fwd: How to avoid RDD shuffling in join after Distributed Matrix calculation

2016-08-22 Thread Tharindu Thundeniya
d be much appreciated. Thanks, Tharindu