I concur with Gourav on this. Both SAP HANA and Oracle Exalytics push
serial scans into hardware. With HANA, it is done by pushing the bitmaps
into the L2 cache on the chip, whilst Oracle has special processors on
SPARC T5 (D-something) that offload the column bit scan from the CPU onto
separate specialised hardware. As a result, both rely on massive
parallelisation. HANA is a true column store and does not end up
duplicating the data as Oracle does.

Now, going back to using Spark as middleware accessing SAP HANA: it may
make sense if the objective is to extract data from SAP HANA and save it
into HDFS. However, I am pretty sure HANA already has the capability to do
so itself. I am more familiar with SAP Sybase IQ; I have tested its
drivers and they work with Spark.
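If the goal is that extract-and-land path, a minimal sketch from spark-shell (Spark 1.x read API) would look like the following. The host, credentials, table and HDFS path are placeholders, and the HANA JDBC driver jar needs to be on the classpath:

```scala
// Sketch only: read a HANA table through Spark's JDBC data source and
// write it to HDFS as Parquet. Host, user, password, table and paths
// below are placeholders, not a working configuration.
val df = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:sap://<hana-host>:30015")
  .option("driver", "com.sap.db.jdbc.Driver")
  .option("user", "<user>")
  .option("password", "<password>")
  .option("dbtable", "SYSTEM.TEST1")
  .load()

df.write.mode("overwrite").parquet("hdfs:///staging/hana/test1")
```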

HTH

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com
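On the NotSerializableException in the quoted message below: Spark ships the JDBC connection Properties to the executors using Java serialization, and the stack trace shows the SAP driver has placed a com.sap.db.jdbc.topology.Host entry into those Properties. That class does not implement java.io.Serializable, so the task fails. A minimal plain-Scala illustration of the mechanism (the Host class here is a hypothetical stand-in for the driver's class):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}
import java.util.Properties

// Hypothetical stand-in for com.sap.db.jdbc.topology.Host: a plain class
// that does NOT implement java.io.Serializable.
class Host(val address: String) {
  override def toString: String = address
}

object SerializationCheck {
  // Try to Java-serialize a value; return Some(message) if a
  // NotSerializableException is thrown, None if serialization succeeds.
  def trySerialize(value: AnyRef): Option[String] = {
    val oos = new ObjectOutputStream(new ByteArrayOutputStream())
    try { oos.writeObject(value); None }
    catch { case e: NotSerializableException => Some(e.getMessage) }
  }

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("dbtable", "SYSTEM.TEST1")        // a String value: serializable
    println(trySerialize(props))                // serializes fine

    props.put("hostlist", new Host("<host>:30015"))  // not serializable
    println(trySerialize(props))                // fails, naming the Host class
  }
}
```

In other words, the Spark side is behaving as designed; the non-serializable object comes in via the driver's connection properties.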



On 29 March 2016 at 18:42, Gourav Sengupta <gourav.sengu...@gmail.com>
wrote:

> Hi Reena,
>
> Why would you want to run Spark off data in SAP HANA? Is not SAP HANA
> already an in-memory, columnar-storage, SAP bells-and-whistles, super-duper
> expensive way of doing what poor people do in Spark, sans the SAP ERP
> integration layers?
>
> I am just trying to understand the use case here.
>
> Regards,
> Gourav
>
> On Tue, Mar 29, 2016 at 3:54 PM, reena upadhyay <
> reena.upadh...@impetus.co.in> wrote:
>
>> I am trying to execute a query using Spark SQL on SAP HANA from the spark
>> shell. I am able to create the DataFrame object, but on calling any action
>> on it I get a java.io.NotSerializableException.
>>
>> Steps I followed after adding the SAP HANA driver jar to the Spark classpath:
>>
>> 1. Start spark-shell
>> 2. val df = sqlContext.load("jdbc", Map("url" ->
>>    "jdbc:sap://172.26.52.54:30015/?databaseName=system&user=SYSTEM&password=Saphana123",
>>    "dbtable" -> "SYSTEM.TEST1"));
>> 3. df.show();
>>
>> I get the below exception on calling any action on the DataFrame object:
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> not serializable: java.io.NotSerializableException:
>> com.sap.db.jdbc.topology.Host
>> Serialization stack:
>>         - object not serializable (class: com.sap.db.jdbc.topology.Host,
>> value:
>> 172.26.52.54:30015)
>>         - writeObject data (class: java.util.ArrayList)
>>         - object (class java.util.ArrayList, [172.26.52.54:30015])
>>         - writeObject data (class: java.util.Hashtable)
>>         - object (class java.util.Properties, {dburl=jdbc:sap://
>> 172.26.52.54:30015,
>> user=SYSTEM, password=Saphana123,
>> url=jdbc:sap://172.26.52.54:30015/?system&user=SYSTEM&password=Saphana123
>> ,
>> dbtable=SYSTEM.TEST1, hostlist=[172.26.52.54:30015]})
>>
>>
>> Caused by: java.io.NotSerializableException: com.sap.db.jdbc.topology.Host
>> Serialization stack:
>>         - object not serializable (class: com.sap.db.jdbc.topology.Host,
>> value:
>> 172.26.52.54:30015)
>>         - writeObject data (class: java.util.ArrayList)
>>         - object (class java.util.ArrayList, [172.26.52.54:30015])
>>         - writeObject data (class: java.util.Hashtable)
>>         - object (class java.util.Properties, {dburl=jdbc:sap://
>> 172.26.52.54:30015,
>> user=SYSTEM, password=Saphana123,
>> url=jdbc:sap://172.26.52.54:30015/?system&user=SYSTEM&password=Saphana123
>> ,
>> dbtable=SYSTEM.TEST1, hostlist=[172.26.52.54:30015]})
>>
>>
>> Appreciate help on this.
>> Thank you
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-execute-query-on-SAPHANA-using-SPARK-tp26628.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
