Hi Pedro Looks like there's something seriously wrong in the generated SQL statement. If possible (and let's do this off-list), can I get access to the app's code so I can reproduce the problem locally, and see if I can fix it?
-- Pat On 05/03/2011, at 9:55 AM, Pedro Cunha wrote: > Pat could you take a look at: http://dl.dropbox.com/u/1175879/output.zip > > As you can see, this is an output for just 1 restaurant and it's > totally weird :\ > > Best Regards, > Pedro > > On Mar 3, 6:57 am, Pat Allan <[email protected]> wrote: >> Hi Pedro >> >> Not that it's particularly helpful, but this is definitely not how Sphinx >> behaves. It definitely should not take an hour to index 15 documents. It >> shouldn't even take a minute, unless the query is particularly complex - and >> this isn't what I'd call complex! >> >> The reason for adding the group_by call, is that PostgreSQL is strict with >> the SQL it expects - anything that isn't within an aggregation function >> should be mentioned in the GROUP BY clause. >> >> Thinking Sphinx will do this automatically for any fields it references to - >> but it can't look into SQL snippets to figure out which columns are used, >> hence why you need to manually add it yourself, because of your CRC'd state >> attribute. >> >> The repeated data is definitely odd. Thinking Sphinx concatenates data from >> columns in associations, but the query shouldn't duplicate data (unless the >> record is attached to many associated records that look the same - but >> that's not a SQL problem, that's a data problem). >> >> When you limit the query to one restaurant, can you confirm it's joining >> only on the menu categories and menu items you're expecting? (Add the id >> columns from both tables to the start of your query, and to the GROUP BY >> clause). >> >> -- >> Pat >> >> On 03/03/2011, at 6:13 AM, Pedro Cunha wrote: >> >> >> >> >> >> >> >>> Hello Pat, >> >>> Have been investigating last days and I'm almost at the same place. >> >>> I'm using PostgreSQL. >> >>> As far as I read we need to trust how postgres analyses queries and he >>> can choose or not to use indexes. >> >>> Herehttp://explain.depesz.com/s/sA9is a bit (only a bit..) more >>> detailed the EXPLAIN ANALYSE. >> >>> I've been digging through the query that TS generates, and for the 10 >>> entities (restaurants) i'm trying to index, 8500~ rows are created. >> >>> What is the reason for when including all "has x" to use "group by"? >> >>> Also I noticed something weird, imagine I pick one of the lines of the >>> output, and now pick a column, is it supposed to see lots of >>> information repeated? like: >> >>> ".. Negócios Negócios Negócios Negócios Negócios Negócios Sair a dois >>> Negócios Negócios Negócios Negócios Negócios Negócios Negócios Sair a >>> dois Negócios Negócios Negócios Sair a dois Negócios Sair a dois >>> Descontrair Sair a dois Sair a dois Sair a dois Sair a dois Sair a >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a >>> dois Sair a dois Sair a dois ..." >> >>> Production database is at the moment with 15 restaurants.. and it's >>> taking 1h to index >> >>> As DB grows, time is not incremental but exponential and this is >>> worrying me a lot. Also because delta indexing is taking around 15~ >>> mins. >> >>> Is it supposed that 1:N associations decay so badly? >> >>> I'm running out of ideas and considering breaking references for this >>> case to save one inner join to happen. >> >>> Restaurant (1 : N ) Menu Categories >>> Menu Categories ( 1 : N ) Menu Items <====== Shortcut this one >> >>> Best regards, >>> Pedro >> >>> On Feb 24, 3:59 am, Pat Allan <[email protected]> wrote: >>>> Hi Pedro >> >>>> Wow, PostgreSQL's EXPLAIN output is pretty complex! >> >>>> I'm not spotting anything obviously odd, but I'm not sure what to look for. >> >>>> Perhaps you try it all again, but comment out all the fields and >>>> attributes except those from menu items? Indexing should be a bit faster, >>>> but I'd imagine there's still a speed hit to some extent. I'm not sure if >>>> that'll shine any light on the situation though... >> >>>> Is this all on your development environment? If it's production or >>>> staging, are you sure that the database indexes on the foreign keys exist? >>>> (ie: latest migrations have been run) >> >>>> -- >>>> Pat >> >>>> On 24/02/2011, at 11:47 AM, Pedro Cunha wrote: >> >>>>> Hello Pat, >> >>>>> Here is the link: >>>>> https://gist.github.com/841520 >> >>>>> With: >>>>> - define_index block >>>>> - sql query on conf >>>>> - explain >>>>> - explain analyse >> >>>>> Thanks in advance >> >>>>> On Feb 23, 11:51 pm, Pat Allan <[email protected]> wrote: >>>>>> Hi Pedro >> >>>>>> Perhaps post the EXPLAIN and EXPLAIN ANALYSE outputs to a gist? To be >>>>>> honest, given I've not used the PostgreSQL equivalents, I'm not sure how >>>>>> much sense I'll be able to make out of it, but let's start with that. >> >>>>>> Also, can you add the SQL query to the gist as well? >> >>>>>> Cheers >> >>>>>> -- >>>>>> Pat >> >>>>>> On 24/02/2011, at 6:07 AM, Pedro Cunha wrote: >> >>>>>>> Hey Pat, I'm using PostgresSQL, and command is the same. >> >>>>>>> I tried looking at the output and it kinda looks like chinese to me. >>>>>>> From what I see, I can see no reference to indexes whatsoever. That >>>>>>> can be a problem. According to Postgres docs I must do an ANALYZE >>>>>>> <table_name> to make pg server aware of an index on that specific >>>>>>> table. >> >>>>>>> After I ve done that and run the EXPLAIN and EXPLAIN ANALYSE I still >>>>>>> can't see any reference to indexes. That might be a problem. >> >>>>>>> Do you want me to copy paste here the whole log or is there some way I >>>>>>> can attach that file to this thread? (it's a bit lengthy) >> >>>>>>> Best regards >> >>>>>>> On Feb 23, 12:31 am, Pat Allan <[email protected]> wrote: >>>>>>>> Hi Pedro >> >>>>>>>> Given you've got the database indexes, this definitely feels far too >>>>>>>> slow. Are you using MySQL or PostgreSQL? >> >>>>>>>> If you're using MySQL, then I would recommend taking the generated >>>>>>>> sql_query value from the restaurant_core_0 source in your sphinx >>>>>>>> configuration file, remove the $start and $end parts of the WHERE >>>>>>>> clause, and add EXPLAIN to the start of the query. Then, run this >>>>>>>> query in a MySQL console: >> >>>>>>>> EXPLAIN SELECT ... >> >>>>>>>> This should give you some information on what database indexes are >>>>>>>> being used for the query. >> >>>>>>>> I'm not sure what the equivalent of the EXPLAIN command for PostgreSQL >>>>>>>> is, but I'd be surprised if such a thing didn't exist. >> >>>>>>>> Cheers >> >>>>>>>> -- >>>>>>>> Pat >> >>>>>>>> On 23/02/2011, at 1:32 AM, Pedro Cunha wrote: >> >>>>>>>>> Hello there, >> >>>>>>>>> I'm having a weird slow indexing. Tried looking for similar issues on >>>>>>>>> this group and couldn't find. >> >>>>>>>>> When I index an has_many through association, perfomance decays a lot. >>>>>>>>> To help trace this issue I ll give as much info as I can. >> >>>>>>>>> So the association looks like this: >> >>>>>>>>> Restaurant (1 : N ) Menu Categories >>>>>>>>> Menu Categories ( 1 : N ) Menu Items >> >>>>>>>>> Index on table menu_categories for restaurant_id (DB level) >>>>>>>>> Index on table menu_items for menu_category_id (DB level) >> >>>>>>>>> define_index on restaurant >>>>>>>>> ... >>>>>>>>> indexes menu_categories.menu_items(:name), :as => :menu_item_name >>>>>>>>> indexes menu_categories.menu_items(:subtitle), :as >>>>>>>>> => :menu_item_subtitle >>>>>>>>> indexes menu_categories.menu_items(:description), :as >>>>>>>>> => :menu_item_description >>>>>>>>> ... >>>>>>>>> end >> >>>>>>>>> Without the indexes before, and with a lot of other stuff I get this >>>>>>>>> perfomance: >> >>>>>>>>> indexing index 'restaurant_core'... >>>>>>>>> collected 12 docs, 3.0 MB >>>>>>>>> collected 122936 attr values >>>>>>>>> sorted 0.2 Mvalues, 100.0% done >>>>>>>>> sorted 2.2 Mhits, 100.0% done >>>>>>>>> total 12 docs, 2973437 bytes >>>>>>>>> total 16.741 sec, 177611 bytes/sec, 0.71 docs/sec >>>>>>>>> distributed index 'restaurant' can not be directly indexed; skipping. >> >>>>>>>>> With the indexes >> >>>>>>>>> indexing index 'restaurant_core'... >>>>>>>>> collected 12 docs, 15.9 MB >>>>>>>>> collected 423920 attr values >>>>>>>>> sorted 0.8 Mvalues, 100.0% done >>>>>>>>> sorted 11.4 Mhits, 100.0% done >>>>>>>>> total 12 docs, 15885288 bytes >>>>>>>>> total 320.783 sec, 49520 bytes/sec, 0.03 docs/sec >>>>>>>>> distributed index 'restaurant' can not be directly indexed; skipping. >> >>>>>>>>> table ids are pretty small so I think that issue is not causing >>>>>>>>> problems.. still I tried adding the sql step config and performance >>>>>>>>> stays the same >> >>>>>>>>> Properties on define_index are: >> >>>>>>>>> set_property :min_prefix_len => 3 >>>>>>>>> set_property :stopwords => "#{RAILS_ROOT}/config/sphinx/ >>>>>>>>> stoplist.txt" >>>>>>>>> set_property :delta => ThinkingSphinx::Deltas::ResqueDelta >> >>>>>>>>> Tried commenting min_prefix_len and perfomance keeps the same.. >> >>>>>>>>> Running out of ideas :( >> >>>>>>>>> Best regards, >>>>>>>>> Pedro >> >>>>>>>>> -- >>>>>>>>> You received this message because you are subscribed to the Google >>>>>>>>> Groups "Thinking Sphinx" group. >>>>>>>>> To post to this group, send email to [email protected]. >>>>>>>>> To unsubscribe from this group, send email to >>>>>>>>> [email protected]. >>>>>>>>> For more options, visit this group >>>>>>>>> athttp://groups.google.com/group/thinking-sphinx?hl=en. >> >>>>>>> -- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "Thinking Sphinx" group. >>>>>>> To post to this group, send email to [email protected]. >>>>>>> To unsubscribe from this group, send email to >>>>>>> [email protected]. >>>>>>> For more options, visit this group >>>>>>> athttp://groups.google.com/group/thinking-sphinx?hl=en. >> >>>>> -- >>>>> You received this message because you are subscribed to the Google Groups >>>>> "Thinking Sphinx" group. >>>>> To post to this group, send email to [email protected]. >>>>> To unsubscribe from this group, send email to >>>>> [email protected]. >>>>> For more options, visit this >> >> ... >> >> read more » > > -- > You received this message because you are subscribed to the Google Groups > "Thinking Sphinx" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/thinking-sphinx?hl=en. > -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/thinking-sphinx?hl=en.
