Re: Zeppelin Integration

ayan guha Wed, 23 Mar 2016 04:08:09 -0700

Hi All

After spending a few more days on this, I finally found the issue listed in
the Spark JIRA: https://issues.apache.org/jira/browse/SPARK-8659


I would love to know if there is any roadmap for this. Maybe someone from
the dev group can confirm?

Thank you in advance

Best
Ayan

On Thu, Mar 10, 2016 at 10:32 PM, ayan guha <guha.a...@gmail.com> wrote:

> Thanks, guys, for the replies. Yes, Zeppelin with Spark is a pretty
> compelling choice for a single user. Any pointers for using Zeppelin in a
> multi-user scenario? In essence: (a) can we use Zeppelin to connect to a
> long-running Spark application which has some pre-cached DataFrames? (b) Can
> the Zeppelin user be passed down so that Ranger can implement Hive RBAC?
>
> I know I am sounding a little vague, but such is the problem state in my
> mind :) Any help will be appreciated.
>
> Best
> Ayan
>
> On Thu, Mar 10, 2016 at 9:51 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Zeppelin is a pretty good choice for Spark. It has a UI that allows you
>> to run your code. It has an Interpreter section where you change the
>> connection configuration. I made mine run on port 21999 (a daemon process
>> on the Linux host where your Spark master is running). It is pretty easy to
>> set up and run.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 10 March 2016 at 10:26, Sabarish Sasidharan <
>> sabarish.sasidha...@manthan.com> wrote:
>>
>>> I believe you need to co-locate Zeppelin on the same node where Spark is
>>> installed. You need to specify SPARK_HOME. The master I used was YARN.
>>>
>>> Zeppelin exposes a notebook interface. A notebook can have many
>>> paragraphs. You run the paragraphs. You can mix multiple contexts in the
>>> same notebook. So the first paragraph can be Scala, the second can be SQL
>>> that uses a DataFrame from the first paragraph, and so on. If you use a
>>> select query, the output is automatically displayed as a chart.
>>>
>>> As RDDs are bound to the SparkContext that creates them, I don't think
>>> Zeppelin can use RDDs that were created by a separate application.
>>>
>>> I don't know if notebooks can be reused within other notebooks. It would
>>> be a nice way of doing some common preparatory work (like building these
>>> RDDs).
>>>
>>> Regards
>>> Sab
>>>
>>> On Thu, Mar 10, 2016 at 2:28 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>
>>>> Hi All
>>>>
>>>> I am writing this in order to get a fair understanding of how Zeppelin
>>>> can be integrated with Spark.
>>>>
>>>> Our use case is to load a few tables from a DB into Spark and run some
>>>> transformations. Once done, we want to expose the data through Zeppelin
>>>> for analytics. I have a few questions around that, to sound out any gross
>>>> architectural flaws.
>>>>
>>>> Questions:
>>>>
>>>> 1. How does Zeppelin connect to Spark? Thriftserver? Thrift JDBC?
>>>>
>>>> 2. What is the scope of a Spark application when it is used from
>>>> Zeppelin? For example, say I have a few subsequent operations in Zeppelin
>>>> like map, filter, reduceByKey, filter, collect. I assume this will
>>>> translate to an application and get submitted to Spark. However, if I
>>>> want to reuse some part of the data from the earlier application (for
>>>> example, the result after the first map transformation), can I do it? Or
>>>> will it be another application and another spark-submit?
>>>>
>>>> In our use case, the data will already be loaded in RDDs. So how can
>>>> Zeppelin access it?
>>>>
>>>> 3. How can I control access to specific RDDs for specific users in
>>>> Zeppelin (assuming we have implemented some login mechanism in Zeppelin
>>>> and we have a mapping between Zeppelin users and their LDAP accounts)?
>>>> Is it even possible?
>>>>
>>>> 4. If Zeppelin is not yet a good choice for this use case, what are
>>>> the alternatives?
>>>>
>>>> I would appreciate any help/pointers/guidance.
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Ayan Guha
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Architect - Big Data
>>> Ph: +91 99805 99458
>>>
>>> Manthan Systems | *Company of the year - Analytics (2014 Frost and
>>> Sullivan India ICT)*
>>>
>>
>>
>
>
> --
> Best Regards,
> Ayan Guha
>



-- 
Best Regards,
Ayan Guha
