Disclaimer: I'm developer avocado for data engineering at JetBrains, so I'm
definitely biased.

And if someone likes Zeppelin, there is an awesome integration of Zeppelin
into IDEA via the Big Data Tools plugin: one can do whatever exploration they
want or need and then extract all of that work into real code with a simple
refactoring → extract Spark Job.

--
Regards,
Pasha

On Sat, 2 Oct 2021 at 04:03, Holden Karau <hol...@pigscanfly.ca> wrote:

> Personally I like Jupyter notebooks for my interactive work and then once
> I’ve done my exploration I switch back to emacs with either scala-metals or
> Python mode.
>
> I think the main takeaway is: do what feels best for you, there is no one
> true way to develop in Spark.
>
> On Fri, Oct 1, 2021 at 1:28 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks guys for your comments.
>>
>> I agree with you, Florian, that opening a terminal in, say, VSC allows you to
>> run a shell script (an sh file) to submit your Spark code. However, this
>> really makes sense only if your IDE is running on a Linux host submitting a
>> job to a Kubernetes or YARN cluster.
>>
>> For Python, I will go with PyCharm, which is specific to the Python world.
>> With Spark, I have used IntelliJ with the Spark plug-in on a Mac for
>> development work. I then created a JAR file, gzipped the whole project,
>> scp'ed it to an IBM sandbox, untarred it and ran it with a pre-prepared shell
>> script with an environment plug-in for dev, test, staging etc.
>>
>> An IDE is also useful for looking at csv or tsv files, or for converting json
>> from one form to another. For json validation, especially if the file is
>> large, you may be restricted from loading the file into a web json validator
>> because of the risk of exposing proprietary data. There is a tool
>> called jq <https://stedolan.github.io/jq/> (a lightweight and flexible
>> command-line JSON processor) that comes in pretty handy for validating json.
>> Download and install it on your OS and run it as
>>
>> tar -xzOf <json_file>.tgz | jq .
>>
>> That will validate the whole tarred and gzipped json file. Otherwise, most
>> of these IDEs come with add-on plugins for various needs. My
>> preference would be to use the best available IDE for the job. I would
>> consider VSC a general-purpose tool. If all else fails, one can always use OS
>> tools like vi, vim, sed, awk etc 🤔
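
Where installing jq is not an option, the same local, offline check can be sketched with the Python standard library. This is only a minimal sketch, not the jq workflow described above: the function names `validate_json_bytes` and `validate_tgz` are made up for illustration, and it assumes each file in the archive holds exactly one UTF-8 JSON document.

```python
import json
import tarfile


def validate_json_bytes(data: bytes) -> bool:
    """Return True if the bytes parse as a single UTF-8 JSON document."""
    try:
        json.loads(data.decode("utf-8"))
        return True
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False


def validate_tgz(path: str) -> dict:
    """Validate every regular file in a gzipped tar archive.

    Returns a dict mapping member name -> True/False. Nothing leaves the
    local machine, which is the point when the data is proprietary.
    """
    results = {}
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile():
                payload = archive.extractfile(member).read()
                results[member.name] = validate_json_bytes(payload)
    return results
```

Because `json.loads` reads each file fully into memory, this suits moderately sized files; for very large ones a streaming parser is likely the better fit.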
>>
>>
>> Cheers
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>> On Fri, 1 Oct 2021 at 06:55, Florian CASTELAIN <
>> florian.castel...@redlab.io> wrote:
>>
>>> Hello.
>>>
>>> Any "evolved" code editor allows you to create tasks (or builds, or
>>> whatever they are called in the IDE you chose). If you do not find anything
>>> that packages by default all you need, you could just create your own tasks.
>>>
>>>
>>> *For yarn, one needs to open a terminal and submit from there. *
>>>
>>> You can create task(s) that launch your yarn commands.
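
One way such a task could look, as a sketch only: a small Python launcher that an IDE task or run configuration invokes. The helper names `build_spark_submit` and `submit` are hypothetical; `--master` and `--deploy-mode` are standard spark-submit options, and the sketch assumes `spark-submit` is on the PATH and the Hadoop/YARN client configuration is already set up on the machine.

```python
import shlex
import subprocess


def build_spark_submit(app_path, master="yarn", deploy_mode="cluster", app_args=()):
    """Build the spark-submit argument list; nothing is executed here."""
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        app_path,
        *app_args,  # arguments passed through to the application itself
    ]


def submit(app_path, **kwargs):
    """Run spark-submit and return its exit code.

    Assumption: spark-submit is on the PATH; otherwise subprocess.run
    raises FileNotFoundError.
    """
    cmd = build_spark_submit(app_path, **kwargs)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd).returncode
```

An IDE task would then call something like `submit("my_job.py")`, where `my_job.py` is a placeholder for the actual application. Keeping the command construction in its own function makes it easy to check the arguments without a cluster.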
>>>
>>>
>>> *With VSC, you get stuff for working with json files but I am not sure
>>> with a plugin for Python *
>>>
>>> In your json task configuration, you can launch whatever you want:
>>> python, shell. I bet you could launch your favorite video game (just make a
>>> task called "let's have a break" 😉)
>>>
>>> Just to say, if you want everything exactly the way you want, I do not
>>> think you will find an IDE that does it. You will have to customize it.
>>> (correct me if wrong, of course).
>>>
>>> Have a good day.
>>>
>>> *Florian CASTELAIN *
>>> *Software Engineer*
>>>
>>> 72 Rue de la République, 76140 Le Petit-Quevilly
>>> m: +33 616 530 226
>>> e: florian.castel...@redlab.io w: www.redlab.io
>>>
>>> ------------------------------
>>> *From:* Jeff Zhang <zjf...@gmail.com>
>>> *Sent:* Thursday, 30 September 2021 13:57
>>> *To:* Mich Talebzadeh <mich.talebza...@gmail.com>
>>> *Cc:* user @spark <user@spark.apache.org>
>>> *Subject:* Re: Choice of IDE for Spark
>>>
>>> IIRC, you want an IDE for pyspark on yarn ?
>>>
>>> On Thu, 30 Sep 2021 at 7:00 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> This may look like a redundant question, but it comes about because of
>>> the advent of cloud workstations like Amazon WorkSpaces and others.
>>>
>>> With IntelliJ you are OK with Spark & Scala. With PyCharm you are fine
>>> with PySpark and the virtual environment. Mind you as far as I know PyCharm
>>> only executes spark-submit in local mode. For yarn, one needs to open a
>>> terminal and submit from there.
>>>
>>> However, on an Amazon workstation, you get Visual Studio Code
>>> <https://code.visualstudio.com/> (VSC, an MS product) and OpenOffice
>>> installed. With VSC, you get stuff for working with json files, but I am not
>>> sure whether, with a plugin for Python etc., it will be as good as PyCharm.
>>> Has anyone used VSC in anger for Spark and, if so, what is the experience?
>>>
>>> Thanks
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
