Re: Any ways to connect BI tool to Spark without Hive

Chanh Le Thu, 07 Jul 2016 21:29:48 -0700

Hi Mich,

Actually, technical users may also write more complex machine learning code 
in the future, which is why Zeppelin is promising.


> Those business users: do they use Oracle BI (OBI) to connect to a DW like 
> Oracle now?
Yes, they do. Our data is still stored in Oracle, but it is growing every day 
and some queries can no longer run in Oracle, so we are moving to another 
storage layer and using Spark to query it.

> I also have Hive running on the Spark engine, which makes such a solution 
> easier by allowing users to connect to Hive and execute their queries. You 
> want to provide a fast retrieval system for your users. Your case is 
> interesting as you have two parallel stacks here.


So does that mean I still need to set up Hive and Hadoop? Our resources are 
limited, and we need to spend almost all of our memory and CPU on Spark and 
Alluxio.
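
For reference, here is a minimal sketch of what an external tool would need 
in order to reach Spark Thrift Server. It assumes the server speaks the 
HiveServer2 protocol on its default port 10000 and that the Parquet files can 
be registered as a table with a data source DDL statement. The host names, 
the table name, and the Alluxio path are hypothetical, and whether this runs 
without a separate Hive and Hadoop installation is exactly what I still need 
to confirm.

```python
# Sketch only: host names, the table name, and the Alluxio path are
# hypothetical placeholders, not values from our cluster.


def thrift_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build the HiveServer2-style JDBC URL that a BI tool would be given."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def parquet_table_ddl(table: str, path: str) -> str:
    """Build a Spark SQL statement that registers existing Parquet files as a table."""
    return f"CREATE TABLE {table} USING parquet OPTIONS (path '{path}')"


if __name__ == "__main__":
    # Connection string for the BI tool's Hive JDBC/ODBC driver.
    print(thrift_jdbc_url("spark-thrift.example.com"))
    # Statement to run once so the Parquet data is visible as a table.
    print(parquet_table_ddl(
        "events",
        "alluxio://alluxio-master.example.com:19998/warehouse/events",
    ))
```

A BI tool with a Hive JDBC or ODBC driver would be pointed at the first 
string. The second would be run once against the server (for example through 
beeline) so that the table shows up for the tool.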

Thanks & regards,
Chanh




> On Jul 8, 2016, at 11:18 AM, Mich Talebzadeh <mich.talebza...@gmail.com> 
> wrote:
> 
> Interesting  Chanh
> 
> Those business users: do they use Oracle BI (OBI) to connect to a DW like 
> Oracle now?
> 
> Certainly power users can use Zeppelin to write code that will be executed 
> through Spark, but I very much doubt Zeppelin can do what the OBI tool 
> provides.
> 
> What you need is to investigate if OBI tool can connect to Spark Thrift 
> Server to use Spark to access your parquet files. Your parquet files are 
> already on HDFS (part of Hadoop).
> 
> Hive has ODBC interfaces to Tableau, and I am sure it can also work with OBI.
> 
> I also have Hive running on the Spark engine, which makes such a solution 
> easier by allowing users to connect to Hive and execute their queries. You 
> want to provide a fast retrieval system for your users. Your case is 
> interesting as you have two parallel stacks here.
> 
> HTH
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn: 
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> 
> http://talebzadehmich.wordpress.com
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
> 
> On 8 July 2016 at 04:58, Chanh Le <giaosu...@gmail.com> wrote:
> Hi Mich,
> Thanks for replying. Currently we think we need to separate users into 2 
> groups: 
> 1. Technical: Can write SQL 
> 2. Business: Can drag and drop fields or metrics and see the result.
> Our stack uses Zeppelin and Spark SQL to query data from Alluxio. Our data is 
> currently stored in parquet files. Zeppelin is using HiveContext, but we 
> haven’t set up Hive and Hadoop yet. 
> 
> I am a little confused about Spark Thrift Server: the Thrift Server in Spark 
> allows external tools to connect, but does it require setting up Hive and 
> Hadoop?
> 
> Thanks and regards,
> Chanh
> 
> 
> 
>> On Jul 8, 2016, at 10:49 AM, Mich Talebzadeh <mich.talebza...@gmail.com> 
>> wrote:
>> 
>> hi,
>> 
>> I have not used Alluxio, but it is a distributed file system much like an 
>> IMDB, say Oracle TimesTen. Spark is your query tool and Zeppelin is the GUI 
>> interface to Spark, which basically lets you build graphs from Spark 
>> queries.
>> 
>> You mentioned Hive so I assume your persistent storage is Hive?
>> 
>> Your business users are using the Oracle BI tool. It is like Tableau. I 
>> assume the Oracle BI tool accesses a database of some sort, say Oracle DW, 
>> using native connectivity, and it may also have ODBC and JDBC connections to 
>> Hive etc.
>> 
>> The issue I see here is your GUI tool Zeppelin which does the same thing as 
>> Oracle BI tool. Can you please clarify below:
>> 
>> 1. You use Hive as your database/persistent storage and use Alluxio on top 
>>    of Hive?
>> 2. Are users accessing Hive or a Data Warehouse like Oracle?
>> 3. Oracle BI tools are pretty mature. Zeppelin is not in the same league so 
>>    you have to decide which technology stack to follow.
>> 4. Spark should work with Oracle BI tool as well (need to check this) as a 
>>    fast query tool. In that case the users can use Oracle BI tool with Spark 
>>    as well.
>> 5. It seems to me that the issue is that users don't want to move from 
>>    Oracle BI tool. We had the same issue with Tableau. So you really need to 
>>    make that Oracle BI tool use Spark and Alluxio and leave Zeppelin at one 
>>    side.
>> 
>> Zeppelin as I used it a while back may not do what Oracle BI tool does. So 
>> the presentation layer has to be Oracle BI tool.
>> 
>> HTH
>> 
>> 
>> 
>> Dr Mich Talebzadeh
>>  
>>  
>> 
>> On 8 July 2016 at 04:19, Chanh Le <giaosu...@gmail.com> wrote:
>> Hi everyone,
>> Currently we use Zeppelin to analyze our data, and because it relies on SQL 
>> it is hard to roll out to users. The users currently use Oracle BI tools 
>> for analysis because these support drag and drop and let us set permissions 
>> for each user.
>> Our architecture is Spark, Alluxio, and Zeppelin, and we want to share what 
>> we have done in Zeppelin with business users.
>> 
>> Is there any way to do that?
>> 
>> Thanks.
>> 
>> 
> 
>
