Hello,

Thanks for this info.

Have you tested this feature on on-premises Oracle, say 11g or 12c, besides
ADW in the cloud?

I can see the transactional feature being useful in terms of commit/rollback
against Oracle, but I cannot figure out the performance gains from your blog
etc.

My concern is that we currently connect to Oracle, as well as many other
JDBC-compliant databases, through Spark's generic JDBC connections
<https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html> with the
same look and feel. Unless there is an overriding reason, I don't see why
there is a need to switch to this feature.
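
For reference, this is the shape of the generic JDBC read we use today
(the URL, table and credentials below are placeholders, not real
endpoints):

    // Generic Spark JDBC read against Oracle; connection details are
    // placeholders only.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1")
      .option("dbtable", "SCOTT.EMP")
      .option("user", "scott")
      .option("password", "tiger")
      .option("fetchsize", "10000")
      .load()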


Cheers


   View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 14 Jan 2022 at 00:50, Harish Butani <rhbutani.sp...@gmail.com>
wrote:

> Spark on Oracle is now available as an open-source, Apache-licensed GitHub
> repo <https://github.com/oracle/spark-oracle>. Build and deploy it as an
> extension jar in your Spark clusters.
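>
> A session setup looks roughly like the sketch below. The value of
> spark.sql.extensions is illustrative, not the project's actual class
> name; see the Quick Start Guide for the exact value:
>
>     // Hypothetical session setup: spark.sql.extensions is a standard
>     // Spark config key, but the class name here is a placeholder.
>     import org.apache.spark.sql.SparkSession
>
>     val spark = SparkSession.builder()
>       .config("spark.sql.extensions",
>         "org.apache.spark.sql.oracle.SparkSessionExtensions")
>       .getOrCreate()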
>
> Use it to combine Apache Spark programs with data in your existing Oracle
> databases without expensive data copying or query time data movement.
>
> The core capability is a set of Optimizer extensions that collapse SQL
> operator sub-graphs into an OraScan that executes equivalent SQL in Oracle.
> Physical plan parallelism
> <https://github.com/oracle/spark-oracle/wiki/Query-Splitting> can be
> controlled to split Spark tasks to operate on Oracle data block ranges, on
> result-set pages, or on table partitions.
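>
> A quick way to check the collapse is to look at the physical plan; the
> table and column names below are from the standard TPCDS schema used in
> the demo:
>
>     // If the optimizer extension applies, the physical plan should
>     // show a single OraScan in place of separate scan, join and
>     // aggregate operators.
>     val df = spark.sql(
>       """SELECT d_year, SUM(ss_net_paid)
>         |FROM store_sales JOIN date_dim ON ss_sold_date_sk = d_date_sk
>         |GROUP BY d_year""".stripMargin)
>     df.explain()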
>
> We push down large parts of Spark SQL to Oracle; for example, 95 of 99 TPCDS
> queries are pushed completely to Oracle.
> <https://github.com/oracle/spark-oracle/wiki/TPCDS-Queries>
>
> With Spark SQL macros
> <https://github.com/oracle/spark-oracle/wiki/Spark_SQL_macros> you can
> write custom Spark UDFs that get translated and pushed down as Oracle SQL
> expressions.
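>
> A sketch of the idea; the registration calls below are assumptions, not
> the confirmed API, so check the macros wiki page for the exact names:
>
>     // Illustrative macro registration -- registerMacro and udm are
>     // assumed names. The point is that the function body is translated
>     // into an Oracle SQL expression rather than executed as an opaque
>     // JVM UDF.
>     spark.registerMacro("taxAndDiscount",
>       spark.udm((price: Double) => price * 1.2 * 0.95))
>
>     spark.sql("SELECT taxAndDiscount(ss_list_price) FROM store_sales")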
>
> With DML pushdown
> <https://github.com/oracle/spark-oracle/wiki/DML-Support>, inserts in
> Spark SQL get pushed down as transactionally consistent inserts/updates on
> Oracle tables.
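>
> For example (store_sales_copy is a placeholder target table):
>
>     // Plain Spark SQL: with DML pushdown, this insert runs as a
>     // transactionally consistent insert on the Oracle table.
>     spark.sql(
>       """INSERT INTO store_sales_copy
>         |SELECT * FROM store_sales
>         |WHERE ss_sold_date_sk = 2451545""".stripMargin)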
>
> See the Quick Start Guide
> <https://github.com/oracle/spark-oracle/wiki/Quick-Start-Guide> for how
> to set up an Oracle free-tier ADW instance, load it with TPCDS data, and
> try out the Spark on Oracle Demo
> <https://github.com/oracle/spark-oracle/wiki/Demo> on your Spark
> cluster.
>
> More details can be found in our blog
> <https://hbutani.github.io/blogs/blog/Spark_on_Oracle_Blog.html> and the
> project wiki <https://github.com/oracle/spark-oracle/wiki>.
>
> regards,
> Harish Butani
>
