Would it be possible to use views to address some of your requirements?
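
For example, if the extra predicate can be fixed up front, you could register a
pre-filtered DataFrame as a temporary table (or create a Hive view) and let users
query that instead of the raw table. A rough sketch against the Spark 1.5
DataFrame API; "events", "school_id" and "student_id" are placeholder names:

  // assumes an existing SQLContext/HiveContext called sqlContext
  val constrained = sqlContext.table("events").filter("school_id = 42")
  constrained.registerTempTable("events_view")

  // user queries run against the view and inherit the predicate
  val result = sqlContext.sql(
    "SELECT student_id, score FROM events_view WHERE student_id = 7")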

Alternatively, it might be better to parse it yourself. There are open source 
libraries for that, if you really need a complete SQL parser. Do you also want 
to do this for subqueries?
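
JSqlParser is one example of such an open source library. A very rough sketch of
the reject/augment steps for a plain (non-nested) SELECT; the column names are
placeholders, and subqueries would need extra handling:

  import net.sf.jsqlparser.parser.CCJSqlParserUtil
  import net.sf.jsqlparser.statement.select.{PlainSelect, Select}
  import net.sf.jsqlparser.expression.operators.conditional.AndExpression

  val stmt = CCJSqlParserUtil.parse(userSql).asInstanceOf[Select]
  val body = stmt.getSelectBody.asInstanceOf[PlainSelect]

  // reject under-constrained queries: require the mandatory column in WHERE
  val where = body.getWhere
  if (where == null || !where.toString.toLowerCase.contains("student_id")) {
    sys.error("query must constrain student_id")
  }

  // augment: AND an extra predicate onto the user's WHERE and regenerate the SQL
  val extra = CCJSqlParserUtil.parseCondExpression("school_id = 42")
  body.setWhere(new AndExpression(where, extra))
  val rewrittenSql = stmt.toString  // pass this string on to sqlContext.sql(...)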

> On 05 Nov 2015, at 23:34, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
> 
> Hi folks, not sure if this belongs on the dev or user list... sending to dev 
> as it seems a bit convoluted.
> 
> I have a UI in which we allow users to write ad-hoc queries against a (very 
> large, partitioned) table. I would like to analyze the queries prior to 
> execution for two purposes:
> 
> 1. Reject under-constrained queries (i.e. there is a field predicate that I 
> want to make sure is always present)
> 2. Augment the query with additional predicates (e.g. if the user asks for a 
> student_id, I also want to push a constraint on another field)
> 
> I could parse the SQL string before passing it to Spark, but obviously Spark 
> already does this anyway. Can someone give me a general direction on how to do 
> this (if possible)?
> 
> Something like
> 
> myDF = sql("user_sql_query")
> myDF.queryExecution.logical  // here examine the filters provided by the user, 
> reject if under-constrained, push new filters as needed (via withNewChildren?)
>  
> At this point, with some luck, I'd have a new LogicalPlan. What is the proper 
> way to create an execution plan on top of this new plan? I'm looking at 
> https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L329
> but this method is restricted to the package. I'd really prefer to hook in 
> as early as possible and still let Spark run the plan optimizations as 
> usual.
> 
> Any guidance or pointers much appreciated.
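
On the queryExecution.logical sketch quoted above: the under-constrained check
can stay on the inspection side, and the extra predicate can simply be stacked
with DataFrame.filter instead of rebuilding a plan by hand, so the
package-private method mentioned there isn't needed and Catalyst still optimizes
the combined plan. A rough sketch for Spark 1.5, using the analyzed plan so that
attribute names are resolved; column names are placeholders:

  import org.apache.spark.sql.catalyst.plans.logical.Filter
  import org.apache.spark.sql.functions.col

  val df = sqlContext.sql(userSqlQuery)

  // collect every Filter predicate in the user's plan
  val predicates =
    df.queryExecution.analyzed.collect { case f: Filter => f.condition }

  // reject if no predicate references the mandatory column
  val constrained =
    predicates.exists(_.references.exists(_.name.equalsIgnoreCase("student_id")))
  if (!constrained) sys.error("query must constrain student_id")

  // augment by stacking another Filter on top; the optimizer runs on the
  // combined plan when this DataFrame is executed, as usual
  val augmented = df.filter(col("school_id") === 42)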
