Re: [EXTERNAL] Re: Unable to access Google buckets using spark-submit

2022-02-14 Thread Saurabh Gulati
Subject: [EXTERNAL] Re: Unable to access Google buckets using spark-submit ... Hi Gaurav, All, I'm doing a spark-submit from my local system to a GCP Dataproc c

Re: Unable to access Google buckets using spark-submit

2022-02-13 Thread karan alang
Hi Gaurav, All, I'm doing a spark-submit from my local system to a GCP Dataproc cluster ... This is more for dev/testing. I can run a 'gcloud dataproc jobs submit' command as well, which is what will be done in Production. Hope that clarifies. regds, Karan Alang On Sat, Feb 12, 2022 at 10:31
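
For context, a minimal sketch of the production-style submission Karan refers to, using 'gcloud dataproc jobs submit'; the cluster name, region, bucket and file names below are placeholders, not values from this thread:

    # Submit a PySpark job to an existing Dataproc cluster; the Dataproc image
    # ships with the GCS connector, so gs:// paths resolve on the cluster.
    gcloud dataproc jobs submit pyspark gs://my-bucket/jobs/my_job.py \
        --cluster=my-cluster \
        --region=us-central1 \
        -- --input gs://my-bucket/data/input.csv

The arguments after the bare '--' are passed through to the job itself.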

Re: Unable to access Google buckets using spark-submit

2022-02-13 Thread karan alang
Hi Holden, when you mention the GS Access jar, which jar is this? Can you pls clarify? thanks, Karan Alang On Sat, Feb 12, 2022 at 11:10 AM Holden Karau wrote: > You can also put the GS access jar with your Spark jars — that’s what the > class not found exception is pointing you towards. >
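
For what Holden's suggestion looks like in practice, a sketch assuming the "GS access jar" is the Hadoop GCS connector; the jar path and application file are placeholders:

    # Option 1: pass the GCS connector jar explicitly at submit time
    spark-submit \
        --jars /path/to/gcs-connector-hadoop3-latest.jar \
        --conf spark.hadoop.fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem \
        --conf spark.hadoop.fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS \
        my_job.py

    # Option 2: copy the same jar next to the other Spark jars so it is always on the classpath
    cp gcs-connector-hadoop3-latest.jar "$SPARK_HOME/jars/"

The class-not-found exception in question typically names com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem, which lives in this connector jar.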

Re: Unable to access Google buckets using spark-submit

2022-02-13 Thread karan alang
Thanks, Mich - will check this and update. regds, Karan Alang On Sat, Feb 12, 2022 at 1:57 AM Mich Talebzadeh wrote: > BTW I also answered you on stackoverflow: > > > https://stackoverflow.com/questions/71088934/unable-to-access-google-buckets-using-spark-submit > > HTH > > > view my

Re: Unable to access Google buckets using spark-submit

2022-02-13 Thread Mich Talebzadeh
Putting the GS access jar with the Spark jars may technically resolve the spark-submit issue, but creating local copies of jar files is not a recommended practice. The approach the thread owner adopted, putting the files in a Google Cloud Storage bucket, is correct. Indeed this is what he states
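
A sketch of the approach Mich is endorsing here: keep the application jar and its dependencies in the bucket and reference them by gs:// URI at submit time, instead of copying jars into the local Spark installation. Bucket, jar and class names are placeholders:

    # Everything the job needs stays in GCS; nothing is copied into $SPARK_HOME/jars
    gcloud dataproc jobs submit spark \
        --cluster=my-cluster \
        --region=us-central1 \
        --class=org.example.MyJob \
        --jars=gs://my-bucket/jars/my-job.jar,gs://my-bucket/jars/extra-dep.jar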

Re: Unable to access Google buckets using spark-submit

2022-02-12 Thread Gourav Sengupta
Hi, agree with Holden, have faced quite a few issues with FUSE. Also trying to understand "spark-submit from local". Are you submitting your SPARK jobs from a local laptop or in local mode from a GCP dataproc / system? If you are submitting the job from your local laptop, there will be
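
If the job really is being submitted from a local laptop, the driver there needs both the GCS connector and credentials before gs:// paths will resolve. A minimal sketch, assuming a service-account key file; all paths are placeholders:

    # Make a service-account key available to the GCS connector on the laptop
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

    spark-submit \
        --jars /path/to/gcs-connector-hadoop3-latest.jar \
        --conf spark.hadoop.google.cloud.auth.service.account.enable=true \
        --conf spark.hadoop.google.cloud.auth.service.account.json.keyfile=/path/to/service-account.json \
        my_job.py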

Re: Unable to access Google buckets using spark-submit

2022-02-12 Thread Holden Karau
You can also put the GS access jar with your Spark jars — that’s what the class not found exception is pointing you towards. On Fri, Feb 11, 2022 at 11:58 PM Mich Talebzadeh wrote: > BTW I also answered you on stackoverflow: > > >

Re: Unable to access Google buckets using spark-submit

2022-02-12 Thread Mich Talebzadeh
BTW I also answered you on stackoverflow: https://stackoverflow.com/questions/71088934/unable-to-access-google-buckets-using-spark-submit HTH view my Linkedin profile https://en.everybodywiki.com/Mich_Talebzadeh

Re: Unable to access Google buckets using spark-submit

2022-02-12 Thread Mich Talebzadeh
You are trying to access a Google storage bucket gs:// from your local host. spark-submit does not see it because it assumes the path is on a local file system on the host, which it is not. You need to mount the gs:// bucket as a local file system. You can use the tool called gcsfuse
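
A sketch of the gcsfuse route described above, assuming gcsfuse is already installed; the bucket name and mount point are placeholders:

    # Mount the bucket as a local directory
    mkdir -p /mnt/gcs/my-bucket
    gcsfuse my-bucket /mnt/gcs/my-bucket

    # spark-submit now sees the bucket contents as ordinary local files
    spark-submit /mnt/gcs/my-bucket/jobs/my_job.py

    # Unmount when finished
    fusermount -u /mnt/gcs/my-bucket

Note that, elsewhere in the thread, Gourav reports having had quite a few issues with FUSE, so the connector-based alternatives above may be preferable.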