I think you will need to write some code to support ES. As I understand it, 
there are two options:

Create a new data source from scratch; you will probably need to implement the 
interface at:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala#L751

Or reuse most of the code in ParquetRelation in the new data source and add 
your own logic where it is needed; see
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala#L285
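
Whichever option you pick, the relation still has to turn the conditions it
receives into an ElasticSearch query. Below is a rough sketch of that step in
Python, only to show the shape of it. Everything in it is an assumption for
illustration: EqualTo, GreaterThan, And, Or and Not are plain Python stand-ins
named after the filter classes in org.apache.spark.sql.sources, and
to_es_query / build_query are made-up helper names, not Spark or ES APIs.
Conditions it cannot translate (a UDF condition, for example) are handed back
so they can still be evaluated on the Spark side.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


# Hypothetical stand-ins for pushed-down filter nodes (not Spark classes).
@dataclass
class EqualTo:
    attribute: str
    value: Any


@dataclass
class GreaterThan:
    attribute: str
    value: Any


@dataclass
class And:
    left: Any
    right: Any


@dataclass
class Or:
    left: Any
    right: Any


@dataclass
class Not:
    child: Any


def to_es_query(f: Any) -> Optional[dict]:
    """Translate one filter tree to an ES query dict, or None if unsupported."""
    if isinstance(f, EqualTo):
        # Assumes the field is not analyzed, so an exact term match is valid.
        return {"term": {f.attribute: f.value}}
    if isinstance(f, GreaterThan):
        return {"range": {f.attribute: {"gt": f.value}}}
    if isinstance(f, (And, Or)):
        left, right = to_es_query(f.left), to_es_query(f.right)
        # Be strict inside a tree: a partially translated branch under a
        # Not or an Or would change the meaning of the query.
        if left is None or right is None:
            return None
        if isinstance(f, And):
            return {"bool": {"must": [left, right]}}
        return {"bool": {"should": [left, right], "minimum_should_match": 1}}
    if isinstance(f, Not):
        child = to_es_query(f.child)
        return None if child is None else {"bool": {"must_not": [child]}}
    return None  # anything else (e.g. a UDF condition) is not translated


def build_query(filters: List[Any]) -> Tuple[dict, List[Any]]:
    """AND together the translatable top-level filters; return the rest."""
    clauses, unhandled = [], []
    for f in filters:
        q = to_es_query(f)
        if q is None:
            unhandled.append(f)
        else:
            clauses.append(q)
    query = {"bool": {"must": clauses}} if clauses else {"match_all": {}}
    return query, unhandled
```

Skipping an untranslatable top-level condition only makes the ES result a
superset of the rows you want, so it is safe as long as the leftover
conditions are applied again afterwards.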

Hope this helps.

Hao
From: james.gre...@baesystems.com [mailto:james.gre...@baesystems.com]
Sent: Thursday, November 19, 2015 11:14 PM
To: dev@spark.apache.org
Subject: new datasource



We have written a new Spark DataSource that uses both Parquet and 
ElasticSearch.  It is based on the existing Parquet DataSource.   When I look 
at the filters being pushed down to buildScan I don’t get anything representing 
any filters based on UDFs – or for any fields generated by an explode – I had 
thought if I made it a CatalystScan I would get everything I needed.



This is fine from the Parquet point of view – but we are using ElasticSearch to 
index/filter the data we are searching and I need to be able to capture the UDF 
conditions – or have access to the Plan AST in order that I can construct a 
query for ElasticSearch.



I am thinking I might just need to patch Spark to do this, but I’d prefer not 
to if there is a way of getting round this without hacking the core code.  Any 
ideas?



Thanks



James


