Thanks a lot Pat, where do you want me to send? On Mar 5, 12:44 am, Pat Allan <[email protected]> wrote: > Hi Pedro > > Looks like there's something seriously wrong in the generated SQL statement. > If possible (and let's do this off-list), can I get access to the app's code > so I can reproduce the problem locally, and see if I can fix it? > > -- > Pat > > On 05/03/2011, at 9:55 AM, Pedro Cunha wrote: > > > > > > > > > Pat could you take a look at:http://dl.dropbox.com/u/1175879/output.zip > > > As you can see, this is an output for just 1 restaurant and it's > > totally weird :\ > > > Best Regards, > > Pedro > > > On Mar 3, 6:57 am, Pat Allan <[email protected]> wrote: > >> Hi Pedro > > >> Not that it's particularly helpful, but this is definitely not how Sphinx > >> behaves. It definitely should not take an hour toindex15 documents. It > >> shouldn't even take a minute, unless the query is particularly complex - > >> and this isn't what I'd call complex! > > >> The reason for adding the group_by call, is that PostgreSQL is strict with > >> the SQL it expects - anything that isn't within an aggregation function > >> should be mentioned in the GROUP BY clause. > > >> Thinking Sphinx will do this automatically for any fields it references to > >> - but it can't look into SQL snippets to figure out which columns are > >> used, hence why you need to manually add it yourself, because of your > >> CRC'd state attribute. > > >> The repeated data is definitely odd. Thinking Sphinx concatenates data > >> from columns in associations, but the query shouldn't duplicate data > >> (unless the record is attached to many associated records that look the > >> same - but that's not a SQL problem, that's a data problem). > > >> When you limit the query to one restaurant, can you confirm it's joining > >> only on the menu categories and menu items you're expecting? (Add the id > >> columns from both tables to the start of your query, and to the GROUP BY > >> clause). > > >> -- > >> Pat > > >> On 03/03/2011, at 6:13 AM, Pedro Cunha wrote: > > >>> Hello Pat, > > >>> Have been investigating last days and I'm almost at the same place. > > >>> I'm using PostgreSQL. > > >>> As far as I read we need to trust how postgres analyses queries and he > >>> can choose or not to use indexes. > > >>> Herehttp://explain.depesz.com/s/sA9isa bit (only a bit..) more > >>> detailed the EXPLAIN ANALYSE. > > >>> I've been digging through the query that TS generates, and for the 10 > >>> entities (restaurants) i'm trying toindex, 8500~ rows are created. > > >>> What is the reason for when including all "has x" to use "group by"? > > >>> Also I noticed something weird, imagine I pick one of the lines of the > >>> output, and now pick a column, is it supposed to see lots of > >>> information repeated? like: > > >>> ".. Negócios Negócios Negócios Negócios Negócios Negócios Sair a dois > >>> Negócios Negócios Negócios Negócios Negócios Negócios Negócios Sair a > >>> dois Negócios Negócios Negócios Sair a dois Negócios Sair a dois > >>> Descontrair Sair a dois Sair a dois Sair a dois Sair a dois Sair a > >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair > >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois > >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a > >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair > >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois > >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a > >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair > >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois > >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a > >>> dois Sair a dois Sair a dois ..." > > >>> Production database is at the moment with 15 restaurants.. and it's > >>> taking 1h toindex > > >>> As DB grows, time is not incremental but exponential and this is > >>> worrying me a lot. Also because delta indexing is taking around 15~ > >>> mins. > > >>> Is it supposed that 1:N associations decay so badly? > > >>> I'm running out of ideas and considering breaking references for this > >>> case to save one inner join to happen. > > >>> Restaurant (1 : N ) Menu Categories > >>> Menu Categories ( 1 : N ) Menu Items <====== Shortcut this one > > >>> Best regards, > >>> Pedro > > >>> On Feb 24, 3:59 am, Pat Allan <[email protected]> wrote: > >>>> Hi Pedro > > >>>> Wow, PostgreSQL's EXPLAIN output is pretty complex! > > >>>> I'm not spotting anything obviously odd, but I'm not sure what to look > >>>> for. > > >>>> Perhaps you try it all again, but comment out all the fields and > >>>> attributes except those from menu items? Indexing should be a bit > >>>> faster, but I'd imagine there's still a speed hit to some extent. I'm > >>>> not sure if that'll shine any light on the situation though... > > >>>> Is this all on your development environment? If it's production or > >>>> staging, are you sure that the database indexes on the foreign keys > >>>> exist? (ie: latest migrations have been run) > > >>>> -- > >>>> Pat > > >>>> On 24/02/2011, at 11:47 AM, Pedro Cunha wrote: > > >>>>> Hello Pat, > > >>>>> Here is the link: > >>>>>https://gist.github.com/841520 > > >>>>> With: > >>>>> - define_index block > >>>>> - sql query on conf > >>>>> - explain > >>>>> - explain analyse > > >>>>> Thanks in advance > > >>>>> On Feb 23, 11:51 pm, Pat Allan <[email protected]> wrote: > >>>>>> Hi Pedro > > >>>>>> Perhaps post the EXPLAIN and EXPLAIN ANALYSE outputs to a gist? To be > >>>>>> honest, given I've not used the PostgreSQL equivalents, I'm not sure > >>>>>> how much sense I'll be able to make out of it, but let's start with > >>>>>> that. > > >>>>>> Also, can you add the SQL query to the gist as well? > > >>>>>> Cheers > > >>>>>> -- > >>>>>> Pat > > >>>>>> On 24/02/2011, at 6:07 AM, Pedro Cunha wrote: > > >>>>>>> Hey Pat, I'm using PostgresSQL, and command is the same. > > >>>>>>> I tried looking at the output and it kinda looks like chinese to me. > >>>>>>> From what I see, I can see no reference to indexes whatsoever. That > >>>>>>> can be a problem. According to Postgres docs I must do an ANALYZE > >>>>>>> <table_name> to make pg server aware of anindexon that specific > >>>>>>> table. > > >>>>>>> After I ve done that and run the EXPLAIN and EXPLAIN ANALYSE I still > >>>>>>> can't see any reference to indexes. That might be a problem. > > >>>>>>> Do you want me to copy paste here the whole log or is there some way I > >>>>>>> can attach that file to this thread? (it's a bit lengthy) > > >>>>>>> Best regards > > >>>>>>> On Feb 23, 12:31 am, Pat Allan <[email protected]> wrote: > >>>>>>>> Hi Pedro > > >>>>>>>> Given you've got the database indexes, this definitely feels far > >>>>>>>> tooslow. Are you using MySQL or PostgreSQL? > > >>>>>>>> If you're using MySQL, then I would recommend taking the generated > >>>>>>>> sql_query value from the restaurant_core_0 source in your sphinx > >>>>>>>> configuration file, remove the $start and $end parts of the WHERE > >>>>>>>> clause, and add EXPLAIN to the start of the query. Then, run this > >>>>>>>> query in a MySQL console: > > >>>>>>>> EXPLAIN SELECT ... > > >>>>>>>> This should give you some information on what database indexes are > >>>>>>>> being used for the query. > > >>>>>>>> I'm not sure what the equivalent of the EXPLAIN command for > >>>>>>>> PostgreSQL is, but I'd be surprised if such a thing didn't exist. > > >>>>>>>> Cheers > > >>>>>>>> -- > >>>>>>>> Pat > > >>>>>>>> On 23/02/2011, at 1:32 AM, Pedro Cunha wrote: > > >>>>>>>>> Hello there, > > >>>>>>>>> I'm having a weirdslowindexing. Tried looking for similar issues on > >>>>>>>>> this group and couldn't find. > > >>>>>>>>> When Iindexan has_many through association, perfomance decays a lot. > >>>>>>>>> To help trace this issue I ll give as much info as I can. > > >>>>>>>>> So the association looks like this: > > >>>>>>>>> Restaurant (1 : N ) Menu Categories > >>>>>>>>> Menu Categories ( 1 : N ) Menu Items > > >>>>>>>>>Indexon table menu_categories for restaurant_id (DB level) > >>>>>>>>>Indexon table menu_items for menu_category_id (DB level) > > >>>>>>>>> define_index on restaurant > >>>>>>>>> ... > >>>>>>>>> indexes menu_categories.menu_items(:name), :as => :menu_item_name > >>>>>>>>> indexes menu_categories.menu_items(:subtitle), :as > >>>>>>>>> => :menu_item_subtitle > >>>>>>>>> indexes menu_categories.menu_items(:description), :as > >>>>>>>>> => :menu_item_description > >>>>>>>>> ... > >>>>>>>>> end > > >>>>>>>>> Without the indexes before, and with a lot of other stuff I get this > >>>>>>>>> perfomance: > > >>>>>>>>> indexingindex'restaurant_core'... > >>>>>>>>> collected 12 docs, 3.0 MB > >>>>>>>>> collected 122936 attr values > >>>>>>>>> sorted 0.2 Mvalues, 100.0% done > >>>>>>>>> sorted 2.2 Mhits, 100.0% done > >>>>>>>>> total 12 docs, 2973437 bytes > >>>>>>>>> total 16.741 sec, 177611 bytes/sec, 0.71 docs/sec > >>>>>>>>> distributedindex'restaurant' can not be directly indexed; skipping. > > >>>>>>>>> With the indexes > > >>>>>>>>> indexingindex'restaurant_core'... > >>>>>>>>> collected 12 docs, 15.9 MB > >>>>>>>>> collected 423920 attr values > >>>>>>>>> sorted 0.8 Mvalues, 100.0% done > >>>>>>>>> sorted 11.4 Mhits, 100.0% done > >>>>>>>>> total 12 docs, 15885288 bytes > >>>>>>>>> total 320.783 sec, 49520 bytes/sec, 0.03 docs/sec > >>>>>>>>> distributedindex'restaurant' can not be directly indexed; skipping. > > >>>>>>>>> table ids are pretty small so I think that issue is not causing > >>>>>>>>> problems.. still I tried adding the sql step config and performance > >>>>>>>>> stays the same > > >>>>>>>>> Properties on define_index are: > > >>>>>>>>> set_property :min_prefix_len => 3 > >>>>>>>>> set_property :stopwords => "#{RAILS_ROOT}/config/sphinx/ > >>>>>>>>> stoplist.txt" > >>>>>>>>> set_property :delta => ThinkingSphinx::Deltas::ResqueDelta > > >>>>>>>>> Tried commenting min_prefix_len and perfomance keeps the same.. > > >>>>>>>>> Running out of ideas :( > > >>>>>>>>> Best regards, > >>>>>>>>> Pedro > > >>>>>>>>> -- > >>>>>>>>> You received this message because you are subscribed to the Google > >>>>>>>>> Groups "Thinking Sphinx" group. > >>>>>>>>> To post to this group, send email to > >>>>>>>>> [email protected]. > >>>>>>>>> To > > ... > > read more »
-- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/thinking-sphinx?hl=en.
