Hi,

Any suggestions on this approach?

Regards,
Rajesh

On Sat, Jan 23, 2016 at 11:24 PM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:

> Hi,
>
> I have a big database table (1 million plus records) in Oracle. I need to
> query records based on a list of input numbers. For this use case, I am
> following the steps below.
>
> I am creating two data frames:
>
> DF1 = computed using a SQL query. It has one million plus records.
>
> DF2 = built by converting my list of input numbers to a data frame.
>
> I register DF1 and DF2 as temp tables and form a SQL query that joins
> them. It should return only the records matching the input numbers.
>
> Steps:
>
> DF1.registerTempTable("E1")
>
> DF2.registerTempTable("E2")
>
> DF3 = sqlContext.sql("select * from E1, E2 where E1.id = E2.id")
>
> DF3.map(row => (row(0), row(1), row(2))).saveToCassandra(keyspace, table1)
>
> Questions:
>
> How will the DF3 records be fetched?
>
> Will DF1 load the entire table (1 million plus records) into memory when
> joining with DF2, or will it fetch only the records matching DF2 from
> Oracle and load those into memory?
>
> Please clarify, and let me know whether my approach is correct.
>
> Regards,
> Rajesh
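
For reference, here is a minimal local sketch of the steps in the quoted message, assuming Spark 1.x with a SQLContext. The small in-memory DF1 is only a stand-in for the Oracle table, the column names (id, name, city) and the rows are made up for illustration, and the Cassandra write is left out so that the sketch runs on its own.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JoinSketch {
  def main(args: Array[String]): Unit = {
    // Local mode so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("join-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // DF1: hypothetical stand-in for the Oracle table (made-up rows and columns).
    val df1 = Seq(
      (1L, "a", "x"),
      (2L, "b", "y"),
      (3L, "c", "z")
    ).toDF("id", "name", "city")

    // DF2: the list of input numbers as a single-column data frame.
    val df2 = Seq(1L, 3L).map(Tuple1(_)).toDF("id")

    df1.registerTempTable("E1")
    df2.registerTempTable("E2")

    // Same join as in the steps; the query is passed as a string.
    val df3 = sqlContext.sql("select * from E1, E2 where E1.id = E2.id")

    df3.explain() // prints the physical plan chosen for the join
    df3.show()    // prints the joined rows

    sc.stop()
  }
}
```

Calling explain() on DF3 in the real job, with DF1 read from Oracle, is one way to inspect how the join is planned.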