I am currently implementing this approach.

Approach:
1. Create data tables in spark-sql and cache them.
2. Configure the Hive metastore to read the cached tables and share the
same metastore as spark-sql (you get the Spark caching advantage).
3. Run Spark code to fetch from the cached tables. In the Spark code you
can generate queries at runtime, as in the sketch below.
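To make step 3 concrete, here is a minimal sketch of what the driver code
could look like, assuming Spark 1.2's HiveContext (which picks up the shared
metastore from a hive-site.xml on the classpath, per step 2). The table name
(events), its columns, and the S3 path are hypothetical placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object AdhocReporting {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("adhoc-reporting"))
        // HiveContext reads hive-site.xml from the classpath, so it shares
        // the same metastore as the spark-sql shell (step 2).
        val hc = new HiveContext(sc)

        // Step 1: register the S3 data as an external table, then cache it.
        // Table/column names and the bucket path are made up for illustration.
        hc.sql("""CREATE EXTERNAL TABLE IF NOT EXISTS events (
                 |  region STRING, amount DOUBLE)
                 |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
                 |LOCATION 's3n://my-bucket/events/'""".stripMargin)
        hc.sql("CACHE TABLE events")

        // Step 3: assemble the query string at runtime from UI selections.
        val filter = "region = 'EU'"   // e.g. chosen from a UI filter widget
        val metric = "SUM(amount)"     // e.g. chosen from a UI metric picker
        val query  = s"SELECT region, $metric FROM events WHERE $filter GROUP BY region"
        hc.sql(query).collect().foreach(println)
      }
    }

Since the filter and metric fragments come from user selections, you would
want to whitelist them against known columns and aggregates rather than
interpolating raw UI strings, to avoid SQL injection.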


On Wed, Feb 11, 2015 at 4:12 PM, Ashish Mukherjee <
ashish.mukher...@gmail.com> wrote:

> Hi,
>
> I am planning to use Spark for a Web-based ad hoc reporting tool on massive
> data-sets on S3. Real-time queries with filters, aggregations and joins
> could be constructed from UI selections.
>
> Online documentation seems to suggest that SharkQL is deprecated and users
> should move away from it. I understand Hive is generally not used for
> real-time querying, and for Spark SQL to work with other data stores, tables
> need to be registered explicitly in code. This would not be
> suitable for a dynamic query construction scenario.
>
> For a real-time, dynamic querying scenario like mine, what is the proper
> tool to be used with Spark SQL?
>
> Regards,
> Ashish
>



-- 

*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
