What version of Drill are you on ? Can you please share the query plan for your query ? Please share the table definitions ? Is this a standalone setup or cluster of Drillbits ?
On Fri, Nov 4, 2016 at 2:23 PM, <mridul.cho...@rbs.com.invalid> wrote: > Hello, > > I am working on a proof of concept to optimize the performance of the > join queries executed through drill. The underlying storage is a NO-SQL > based database - Mongo DB. The time it takes to return the result of the > join query is very high (46 seconds). Upon further analysis, as per the > physical plan of the query it is observed that both the left side (15 lakh > records) and right side table (13 lakh) are fully scanned and it takes 24 > seconds and 20 seconds respectively for the same. > > Here is the query - > > select ta.[SOME_COLUMN] from mongo.Test.TABLEA ta INNER JOIN > mongo.Test.TABLEB ta ON ta.Id = tb.Id and ta.Id ='123' > > Records in table A : 15 Lakhs > > Records in table B : 13 Lakhs > > Filter condition : Id is indexed field in both the tables (ascending) > > Drill Plan shows Hash-join being performed as shown in the plan (as > attached above) > > I have the following questions - > > 1) Why does drill fetch all the records into the memory even though > filter condition on the indexed column was provided for one of the table ? > > 2) If the Drill planner/optimizer is intelligent then records could > have been filtered on the second table as well based on the join condition > (to reduce the dataset and hence result in faster execution time) > > 3) Are there any performance tuning options available to bring down > the query response time and generate a better query plan ? > > > > I would appreciate if you can provide me pointers/directions or answers to > the above questions. > > Please feel free if you need any additional information. > > Thanks, > > Mridul > > > > > > > > > *********************************************************************************** > > The Royal Bank of Scotland plc. Registered in Scotland No 90312. > Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. > Authorised by the Prudential Regulation Authority and regulated > by the Financial Conduct Authority and Prudential Regulation Authority. > The Royal Bank of Scotland N.V. is authorised and regulated by the > De Nederlandsche Bank and has its seat at Amsterdam, the > Netherlands, and is registered in the Commercial Register under > number 33002587. Registered Office: Gustav Mahlerlaan 350, > Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and > The Royal Bank of Scotland plc are authorised to act as agent for each > other in certain jurisdictions. > > This e-mail message is confidential and for use by the addressee only. > If the message is received by anyone other than the addressee, please > return the message to the sender by replying to it and then delete the > message from your computer. Internet e-mails are not necessarily > secure. The Royal Bank of Scotland plc and The Royal Bank of Scotland > N.V. including its affiliates ("RBS group") does not accept responsibility > for changes made to this message after it was sent. For the protection > of RBS group and its clients and customers, and in compliance with > regulatory requirements, the contents of both incoming and outgoing > e-mail communications, which could include proprietary information and > Non-Public Personal Information, may be read by authorised persons > within RBS group other than the intended recipient(s). > > Whilst all reasonable care has been taken to avoid the transmission of > viruses, it is the responsibility of the recipient to ensure that the > onward > transmission, opening or use of this message and any attachments will > not adversely affect its systems or data. No responsibility is accepted > by the RBS group in this regard and the recipient should carry out such > virus and other checks as it considers appropriate. > > Visit our website at www.rbs.com > *********************************************************************************** > >