Hi

IMHO, this approach is not very useful.

First, regarding the two use cases mentioned on the project page:

1. Simplify Spark development - I think the only thing that can be done
there is to come up with some boilerplate function which essentially takes
a SQL string and returns a temp table name and a corresponding DataFrame
(remember the project targets structured data sources only, not streaming
or RDDs). Building another mini-DSL on top of the already fairly elaborate
Spark API never appealed to me.
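To make the idea concrete, here is a minimal sketch of that boilerplate
pattern. The function name `run_sql` and the `tmp_N` naming scheme are my
own invention, and sqlite3 stands in for Spark so the sketch is
self-contained; with Spark you would call `spark.sql(query)` and register
the result via `df.createOrReplaceTempView(name)` instead:

```python
import sqlite3
from itertools import count

# Generator for unique temp-table names (hypothetical naming scheme).
_counter = count(1)

def run_sql(conn, query):
    """Run a SQL query, register its result under a generated temp-table
    name, and return (name, rows) so later queries can reference it."""
    name = f"tmp_{next(_counter)}"
    conn.execute(f"CREATE TEMP TABLE {name} AS {query}")
    rows = conn.execute(f"SELECT * FROM {name}").fetchall()
    return name, rows

# Tiny demo table standing in for a structured data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

name, rows = run_sql(conn, "SELECT x * 2 AS y FROM t")
print(name, rows)  # tmp_1 [(2,), (4,), (6,)]
```

The point is that a SQL-only user never touches the host API directly:
each statement yields a named intermediate result that the next statement
can query, which is essentially all such a wrapper can offer on top of
what `spark.sql` already does.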

2. Business analysts using Spark - the single-word answer is notebooks.
Take your pick - Jupyter, Zeppelin, Hue.

The case of "Spark is for developers", IMHO, stemmed from the
packaging/building overhead of Spark apps. For Python users, this barrier
is considerably lower (and maybe that is why I do not see a prominent
need).

But I can imagine the pain of a SQL developer coming into a Scala/Java
world. I came from a hardcore SQL/DWH environment where I used to write
SQL and SQL only. So SBT and MVN are still not my friends; maybe someday
they will be. But I learned them the hard way, because the value of using
Spark offsets the pain by a long, long way. So I think there is a need to
spend time with the environment to get comfortable with it. And maybe,
just maybe, use NiFi in case you miss drag-and-drop features too much :)

But these are my 2c and my sincerely humble opinion, and I wish you all
the luck with your project.

On Tue, Jun 13, 2017 at 3:23 PM, Benjamin Kim <bbuil...@gmail.com> wrote:

> Hi Bo,
>
> +1 for your project. I come from the world of data warehouses, ETL, and
> reporting analytics. There are many individuals who do not know or want to
> do any coding. They are content with ANSI SQL and stick to it. ETL
> workflows are also done without any coding using a drag-and-drop user
> interface, such as Talend, SSIS, etc. There is a small amount of scripting
> involved but not too much. I looked at what you are trying to do, and I
> welcome it. This could open up Spark to the masses and shorten development
> times.
>
> Cheers,
> Ben
>
>
> On Jun 12, 2017, at 10:14 PM, bo yang <bobyan...@gmail.com> wrote:
>
> Hi Aakash,
>
> Thanks for your willingness to help :) It will be great if I can get more
> feedback on my project. For example, are there other people who feel the
> need to use a script to write Spark jobs easily? Also, I will explore
> whether the Spark project could take on some of the work to build such a
> script-based high-level DSL.
>
> Best,
> Bo
>
>
> On Mon, Jun 12, 2017 at 12:14 PM, Aakash Basu <aakash.spark....@gmail.com>
> wrote:
>
>> Hey,
>>
>> I work on Spark SQL and would pretty much be able to help you in this.
>> Let me know your requirement.
>>
>> Thanks,
>> Aakash.
>>
>> On 12-Jun-2017 11:00 AM, "bo yang" <bobyan...@gmail.com> wrote:
>>
>>> Hi Guys,
>>>
>>> I am writing a small open source project
>>> <https://github.com/uber/uberscriptquery> to use SQL Script to write
>>> Spark Jobs. Want to see if there are other people interested to use or
>>> contribute to this project.
>>>
>>> The project is called UberScriptQuery (
>>> https://github.com/uber/uberscriptquery). Sorry for the dumb name; I
>>> had to avoid conflicts with many other names (Spark is a registered
>>> trademark, so I could not use Spark in my project name).
>>>
>>> In short, it is a high level SQL-like DSL (Domain Specific Language) on
>>> top of Spark. People can use that DSL to write Spark jobs without worrying
>>> about Spark internal details. Please check README
>>> <https://github.com/uber/uberscriptquery> in the project to get more
>>> details.
>>>
>>> It will be great if I could get any feedback or suggestions!
>>>
>>> Best,
>>> Bo
>>>
>>>
>
>


-- 
Best Regards,
Ayan Guha
