Hi,

I have a large table (over 1 million records) in Oracle. I need to
query records based on a list of input numbers. For this use case, I am
following the steps below.

I am creating two data frames.

DF1 = computed from a SQL query against the Oracle table. It has over one
million records.

DF2 = built from the list of input numbers, which I convert to a
data frame.

I register DF1 and DF2 as temp tables and join them with a SQL query.
The query should return as many records as there are input numbers.

Steps :-

DF1.registerTempTable("E1")

DF2.registerTempTable("E2")

DF3 = sqlContext.sql("select * from E1, E2 where E1.id = E2.id")

DF3.map(row => (row(0),row(1),row(2))).saveToCassandra(keyspace, table1)

*Question :-*

How will the DF3 records be fetched?

Does DF1 load the entire table (over 1 million records) into memory when
it is joined with DF2? Or will it fetch only the records matching DF2 from
Oracle and load those into memory?
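
To make the second alternative concrete: one way I could imagine fetching
only the matching records is to put the input numbers into the Oracle query
itself, instead of joining in Spark. Below is a rough sketch of that idea,
not something I have running. The object PushdownQuery, the method
subqueries and the table name are hypothetical, and I am assuming the ids
are numeric. Oracle rejects an IN list with more than 1000 entries
(ORA-01795), so the ids are split into chunks.

```scala
// Hypothetical helper (not part of Spark): builds one subquery per chunk of
// ids, so that Oracle does the filtering before any rows reach Spark.
object PushdownQuery {
  // Oracle allows at most 1000 literals in an IN list (ORA-01795),
  // so the default chunk size is 1000.
  def subqueries(table: String, ids: Seq[Long], chunkSize: Int = 1000): Seq[String] =
    ids.distinct
      .grouped(chunkSize)
      .map { chunk =>
        // The trailing "t" is a table alias; Oracle does not accept "AS" here.
        s"(select * from $table where id in (${chunk.mkString(", ")})) t"
      }
      .toList
}
```

As far as I understand, each returned string could be passed as the
"dbtable" option of the JDBC data source in place of the plain table name,
and the resulting data frames combined with unionAll. I have not verified
that this is the recommended approach.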

Please clarify, and let me know whether my approach is correct.

Regards,
Rajesh
