Hi Nihed,

Interesting to see Envelope; the idea there is the same! Thanks for sharing
:)

Best,
Bo


On Wed, Jun 14, 2017 at 12:22 AM, nihed mbarek <nihe...@gmail.com> wrote:

> Hi
>
> I already saw a project with the same idea.
> https://github.com/cloudera-labs/envelope
>
> Regards,
>
> On Wed, 14 Jun 2017 at 04:32, bo yang <bobyan...@gmail.com> wrote:
>
>> Thanks Benjamin and Ayan for the feedback! You kind of represent two
>> groups of people: those who need such a script tool and those who don't.
>> Personally I find the script very useful for writing ETL pipelines and
>> daily jobs. Let's see whether other people are interested in such a project.
>>
>> Best,
>> Bo
>>
>>
>>
>>
>>
>> On Mon, Jun 12, 2017 at 11:26 PM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> IMHO, this approach is not very useful.
>>>
>>> Firstly, on the two use cases mentioned on the project page:
>>>
>>> 1. Simplify Spark development - I think the only thing that can be done
>>> there is to come up with some boilerplate function, which essentially
>>> takes a SQL statement and comes back with a temp table name and a
>>> corresponding DataFrame (remember the project targets structured data
>>> sources only, not streaming or RDDs). Building another mini-DSL on top of
>>> the already fairly elaborate Spark API has never appealed to me.
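For concreteness, the boilerplate function Ayan describes might look roughly like this in PySpark (a hypothetical sketch; the function name, the view-naming scheme, and the assumption of an existing SparkSession are all illustrative, not part of any actual project):

```python
import itertools

# Counter for generating unique temp view names (an illustrative scheme).
_view_ids = itertools.count()

def run_sql(spark, sql):
    """Run one SQL statement, register the result as a temp view,
    and return (view_name, DataFrame).

    `spark` is assumed to be a SparkSession (or anything exposing
    a compatible .sql() method).
    """
    df = spark.sql(sql)
    name = "tmp_view_{}".format(next(_view_ids))
    df.createOrReplaceTempView(name)
    return name, df
```

Later statements in a script could then refer to the returned view name, which is about all such a wrapper needs in order to thread one SQL step into the next.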
>>>
>>> 2. Business Analysts using Spark - the single-word answer is notebooks.
>>> Take your pick: Jupyter, Zeppelin, Hue.
>>>
>>> The notion that "Spark is for developers", IMHO, stemmed from the
>>> packaging/building overhead of Spark apps. For Python users, this barrier
>>> is considerably lower (and maybe that is why I do not see a prominent
>>> need).
>>>
>>> But I can imagine the pain of a SQL developer coming into a Scala/Java
>>> world. I came from a hardcore SQL/DWH environment where I used to write
>>> SQL and SQL only, so SBT and MVN are still not my friends. Maybe someday
>>> they will be. But I learned them the hard way, because the value of using
>>> Spark offsets the pain by a long way. So I think there is a need to spend
>>> time with the environment to get comfortable with it. And maybe, just
>>> maybe, use NiFi in case you miss drag-and-drop features too much :)
>>>
>>> But these are my 2c and my sincerely humble opinion, and I wish you all
>>> the luck with your project.
>>>
>>> On Tue, Jun 13, 2017 at 3:23 PM, Benjamin Kim <bbuil...@gmail.com>
>>> wrote:
>>>
>>>> Hi Bo,
>>>>
>>>> +1 for your project. I come from the world of data warehouses, ETL, and
>>>> reporting analytics. There are many individuals who do not know how to
>>>> code or do not want to do any coding. They are content with ANSI SQL and
>>>> stick to it. ETL workflows are also built without any coding, using a
>>>> drag-and-drop user interface such as Talend, SSIS, etc. There is a small
>>>> amount of scripting involved, but not too much. I looked at what you are
>>>> trying to do, and I welcome it. This could open up Spark to the masses
>>>> and shorten development times.
>>>>
>>>> Cheers,
>>>> Ben
>>>>
>>>>
>>>> On Jun 12, 2017, at 10:14 PM, bo yang <bobyan...@gmail.com> wrote:
>>>>
>>>> Hi Aakash,
>>>>
>>>> Thanks for your willingness to help :) It would be great if I could get
>>>> more feedback on my project. For example, do other people also feel the
>>>> need for a script to write Spark jobs easily? Also, I would like to
>>>> explore whether the Spark project might take on some work to build such
>>>> a script-based high-level DSL.
>>>>
>>>> Best,
>>>> Bo
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 12:14 PM, Aakash Basu <
>>>> aakash.spark....@gmail.com> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> I work on Spark SQL and would pretty much be able to help you with
>>>>> this. Let me know your requirements.
>>>>>
>>>>> Thanks,
>>>>> Aakash.
>>>>>
>>>>> On 12-Jun-2017 11:00 AM, "bo yang" <bobyan...@gmail.com> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> I am writing a small open source project
>>>>>> <https://github.com/uber/uberscriptquery> that uses SQL scripts to
>>>>>> write Spark jobs, and I want to see if other people are interested in
>>>>>> using or contributing to it.
>>>>>>
>>>>>> The project is called UberScriptQuery
>>>>>> (https://github.com/uber/uberscriptquery). Sorry for the dumb name; I
>>>>>> chose it to avoid conflicts with many other names (Spark is a
>>>>>> registered trademark, so I could not use "Spark" in my project name).
>>>>>>
>>>>>> In short, it is a high-level SQL-like DSL (Domain Specific Language)
>>>>>> on top of Spark. People can use that DSL to write Spark jobs without
>>>>>> worrying about Spark internals. Please check the README
>>>>>> <https://github.com/uber/uberscriptquery> in the project for more
>>>>>> details.
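To illustrate the general idea (the syntax below is invented for this sketch and is not UberScriptQuery's actual grammar), such a DSL could be as simple as a script of named SQL statements that a small driver splits and runs in order, registering each result as a temp view so later statements can build on earlier ones:

```python
def parse_script(script):
    """Split a script of `name = <sql>;` statements into (name, sql) pairs.

    A toy parser for illustration only: it does not handle `=` or `;`
    appearing inside the SQL itself, quoting, or comments.
    """
    statements = []
    for stmt in script.split(";"):
        stmt = stmt.strip()
        if not stmt:
            continue
        name, _, sql = stmt.partition("=")
        statements.append((name.strip(), sql.strip()))
    return statements

def run_script(spark, script):
    """Execute each statement with spark.sql(), registering every result
    as a temp view under its given name so later statements can query it."""
    for name, sql in parse_script(script):
        df = spark.sql(sql)
        df.createOrReplaceTempView(name)
```

With this kind of driver, a user writes only SQL plus statement names; the Spark session, DataFrame handling, and view registration all stay hidden behind the script runner.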
>>>>>>
>>>>>> It would be great if I could get any feedback or suggestions!
>>>>>>
>>>>>> Best,
>>>>>> Bo
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>>
>> --
>
> M'BAREK Med Nihed,
> Fedora Ambassador, TUNISIA, Northern Africa
> http://www.nihed.com
>
> <http://tn.linkedin.com/in/nihed>
>
>
