Both of these queries will do in-memory scans. This will cause a performance
penalty as the volume of data increases.

You could get better performance if you use Basic Search, which in turn uses
the fulltext indexes. The only catch is that you will have to do some of the
processing yourself.
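
As a rough illustration, here is a minimal Python sketch of that approach. It
assumes the Basic Search REST endpoint (POST /api/atlas/v2/search/basic) with
an entityFilters body, and that the attributes you need come back on each
returned entity header. The host, the helper names (wildcard_payload,
top_by_attribute, basic_search) and the auth handling are hypothetical, and
paging through large result sets is left out.

```python
import json
import urllib.request

# Assumed values, not taken from this thread: adjust to your deployment.
ATLAS_URL = "http://localhost:21000"
BASIC_SEARCH_PATH = "/api/atlas/v2/search/basic"


def wildcard_payload(term, type_name="Table", limit=25):
    """Build a Basic Search body matching `term` in name OR description."""
    return {
        "typeName": type_name,
        "excludeDeletedEntities": True,
        "limit": limit,
        "entityFilters": {
            "condition": "OR",
            "criterion": [
                {"attributeName": "name", "operator": "contains",
                 "attributeValue": term},
                {"attributeName": "description", "operator": "contains",
                 "attributeValue": term},
            ],
        },
    }


def top_by_attribute(entities, attribute, n=10):
    """Client-side stand-in for 'ORDERBY <attribute> desc LIMIT n'."""
    # Skip entities that do not carry the attribute at all.
    scored = [e for e in entities
              if (e.get("attributes") or {}).get(attribute) is not None]
    scored.sort(key=lambda e: e["attributes"][attribute], reverse=True)
    return scored[:n]


def basic_search(payload, auth_header=None):
    """POST the payload to Atlas and return the list of entity headers."""
    headers = {"Content-Type": "application/json"}
    if auth_header:
        headers["Authorization"] = auth_header
    req = urllib.request.Request(
        ATLAS_URL + BASIC_SEARCH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("entities", [])
```

The "processing yourself" part is the top_by_attribute step: the search
narrows the candidates through the index, and the ordering and limit are
applied in the client.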

~ ashutosh


From: Verdan Mahmood <[email protected]>
Date: Wednesday, August 14, 2019 at 10:39 AM
To: "[email protected]" <[email protected]>, Ashutosh Mestry 
<[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: Help needed with Production Configurations

We are using Atlas 2.0

We have the following relations:

Table(DataSet):

hive_table(Table):
metadata = Metadata

Metadata:
popularityScore

and one of our queries is

FROM Table SELECT metadata.__guid ORDERBY popularityScore desc LIMIT 10

Another one is for the wildcard search

Table from Table where name like '*{query_term}*' or description like '*{query_term}*'

Any suggestions on how to improve those? Should we use the fulltext/freetext
searches?


Best,
Verdan Mahmood



On Wed, Aug 14, 2019 at 7:22 PM Ashutosh Mestry <[email protected]> 
wrote:
What version are you using?

DSL queries mostly do not use indexes, hence a well-tuned Solr configuration
may not be of much value. The pre-1.0 version of DSL is not very efficient in
terms of performance.

What kind of queries do you have? Can you post some examples?

A few things you could try:
- Increase Atlas' memory. DSL uses in-memory filtering for certain kinds of
queries. Additional memory may help.
- Increase the number of shards in Solr (the default is 1; adding shards will
help improve throughput). You will have to use Solr API calls to add shards.
The num_shards property gets used only during the very first start.
- Analyze your queries and see if you can use Basic Search, since it is
optimized to use Solr primarily.
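
For the shard suggestion, one possible sketch in Python, assuming the Solr
Collections API SPLITSHARD action is what suits your setup. The Solr base URL
is an assumption, the collection names are assumed to be the Atlas defaults
(vertex_index, edge_index, fulltext_index), and the helper names are
hypothetical. Verify all of these against your cluster before running anything.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed values, not taken from this thread: adjust to your deployment.
SOLR_URL = "http://localhost:8983/solr"
ATLAS_COLLECTIONS = ("vertex_index", "edge_index", "fulltext_index")


def split_shard_url(collection, shard="shard1", base_url=SOLR_URL):
    """Build a Collections API URL that splits one shard of a collection."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def split_shard(collection, shard="shard1"):
    """Issue the SPLITSHARD request and return Solr's raw response text."""
    with urllib.request.urlopen(split_shard_url(collection, shard)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Print the URLs only; call split_shard() once they look right.
    for name in ATLAS_COLLECTIONS:
        print(split_shard_url(name))
```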

~ ashutosh


On 8/14/19, 10:10 AM, "Verdan Mahmood" <[email protected]> wrote:

    Hello Community,

    We recently deployed Atlas with Solr in our production environment, and
    have around 30k hive tables, with a couple of custom entity types.
    The DSL queries are pretty slow, taking an average of 15 seconds each
    time.
    Our Solr is pretty efficient and works well with all kinds of queries
    from the Solr UI.

    Do you guys have some kind of configuration that you fine-tune to get the
    most out of Atlas?


    Best,
    *Verdan Mahmood*
