Hi Gedeon

Unfortunately, it's Rails/ActiveRecord that decides how to build the joins, so 
yes, if you want to change those, you will need to edit the query manually, 
then run rake ts:reindex instead of ts:index to avoid TS overwriting your 
changes.

-- 
Pat

On 13/05/2010, at 9:26 PM, Gedeon wrote:

> We might have found the problem. It seems to be caused by the
> memberships.
> Is it possible to change the LEFT OUTER JOIN in the query from the
> Rails application? Or do we have to tweak it manually and reindex the
> code without generating the config file?
> 
> Gedeon
> 
> On May 13, 4:48 pm, Gedeon <[email protected]> wrote:
>> Some update:
>> I think i found the smallest group of indexes that can cause the
>> problem. Removing any of them seems to solve everything (disk usage
>> limited to a few MB):
>> 
>>     # fields
>>     indexes writeboards(:name), :as => :writeboard_name
>>     indexes project_documents.filename, :as => :file_name
>>     # attributes
>>     has memberships.user_id, :as => :member_ids, :type => :multi
>> 
>> My models have the following 'indexed' relationships:
>> 
>> Project:
>> has_many :memberships, :dependent => :destroy
>> has_many :writeboards, :order => 'created_at DESC', :dependent
>> => :destroy
>> has_many :project_documents, :order => 'created_at DESC', :dependent
>> => :destroy
>> 
>> Membership:
>> belongs_to :project
>> 
>> Writeboard:
>> belongs_to :project
>> 
>> ProjectDocument:
>> belongs_to :project
>> 
>> And the query in the config file is much smaller and maybe easier to
>> understand. I still cannot see what's wrong with it, though:
>>  SELECT SQL_NO_CACHE `projects`.`id` *5 +4 AS `id` ,
>> GROUP_CONCAT( DISTINCT IFNULL( `writeboards`.`name` , '0' )
>> SEPARATOR ' ' ) AS `writeboard_name` , GROUP_CONCAT( DISTINCT
>> IFNULL( `project_documents`.`filename` , '0' )
>> SEPARATOR ' ' ) AS `file_name` , `projects`.`id` AS
>> `sphinx_internal_id` , 3759073650 AS `class_crc` , '3759073650' AS
>> `subclass_crcs` , 0 AS `sphinx_deleted` , GROUP_CONCAT( DISTINCT
>> IFNULL( `memberships`.`user_id` , '0' )
>> SEPARATOR ',' ) AS `member_ids`
>> FROM `projects`
>> LEFT OUTER JOIN `writeboards` ON writeboards.project_id = projects.id
>> LEFT OUTER JOIN `project_documents` ON project_documents.project_id =
>> projects.id
>> LEFT OUTER JOIN `memberships` ON memberships.project_id = projects.id
>> WHERE `projects`.`id` >=8
>> AND `projects`.`id` <=8
>> AND `projects`.`delta` =0
>> GROUP BY `projects`.`id`
>> ORDER BY NULL
>> LIMIT 0 , 30
>> 
>> It took a bit more than 1 minute to complete in phpmyadmin (on only
>> one project, id = 8)
>> 
>> Now, as a reference, the same index without project documents:
>>  SELECT SQL_NO_CACHE `projects`.`id` *5 +4 AS `id` ,
>> GROUP_CONCAT( DISTINCT IFNULL( `writeboards`.`name` , '0' )
>> SEPARATOR ' ' ) AS `writeboard_name` , `projects`.`id` AS
>> `sphinx_internal_id` , 3759073650 AS `class_crc` , '3759073650' AS
>> `subclass_crcs` , 0 AS `sphinx_deleted` , GROUP_CONCAT( DISTINCT
>> IFNULL( `memberships`.`user_id` , '0' )
>> SEPARATOR ',' ) AS `member_ids`
>> FROM `projects`
>> LEFT OUTER JOIN `writeboards` ON writeboards.project_id = projects.id
>> LEFT OUTER JOIN `memberships` ON memberships.project_id = projects.id
>> WHERE `projects`.`id` >=8
>> AND `projects`.`id` <=8
>> AND `projects`.`delta` =0
>> GROUP BY `projects`.`id`
>> ORDER BY NULL
>> LIMIT 0 , 30
>> 
>> Takes 0.1 sec. And I only have about 20 project documents...
>> 
>> There is definitely something funky going on there. However, the
>> simplicity of the indexes, relationships and fields leave me
>> clueless...
>> 
>> Gedeon
>> 
>> On May 13, 3:10 pm,Gedeon<[email protected]> wrote:
>> 
>>> Well, the error simply means - in this case - that the disk is full.
>>> It used the 3GB i have available on the partition. The table cannot be
>>> repaired, as it is a temporary table created by mysql to gather the
>>> results of the query.
>> 
>>> I'm currently emptying tables one by one and indexing every time to
>>> see if any of the causes this problem.
>> 
>>> On May 13, 12:00 pm, Pat Allan <[email protected]> wrote:
>> 
>>>> That error message looks like a MySQL issue, not a Sphinx issue - I'd be 
>>>> hunting on the MySQL site for that (and following its suggestion of 
>>>> repairing the tables/db).
>> 
>>>> --
>>>> Pat
>> 
>>>> On 13/05/2010, at 2:51 PM,Gedeonwrote:
>> 
>>>>> Hi Pat,
>> 
>>>>> I tried increasing the memory up to 512M, but it does not make it any
>>>>> better.
>> 
>>>>> I can reproduce the error in phpmyadmin by running the query found in
>>>>> the sphinx config file and changing the $start and $end id limits.
>>>>> This is the query and error i get:
>> 
>>>>> Error
>> 
>>>>> SQL query: Documentation
>> 
>>>>> SELECT SQL_NO_CACHE `projects`.`id` *5 +4 AS `id` , `projects`.`name`
>>>>> AS `name` , `projects`.`mission` AS `mission` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `writeboards`.`name` , '0' )
>>>>> SEPARATOR ' ' ) AS `writeboard_name` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `writeboard_contents`.`body` , '0' )
>>>>> SEPARATOR ' ' ) AS `mr_content_body` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `writeboard_comments`.`body` , '0' )
>>>>> SEPARATOR ' ' ) AS `mr_comment_body` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `folders`.`name` , '0' )
>>>>> SEPARATOR ' ' ) AS `folder_name` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `folders`.`description` , '0' )
>>>>> SEPARATOR ' ' ) AS `folder_description` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `project_documents`.`filename` , '0' )
>>>>> SEPARATOR ' ' ) AS `file_name` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `project_documents`.`description` , '0' )
>>>>> SEPARATOR ' ' ) AS `file_description` , `projects`.`id` AS
>>>>> `sphinx_internal_id` , 3759073650 AS `class_crc` , '3759073650' AS
>>>>> `subclass_crcs` , 0 AS `sphinx_deleted` , GROUP_CONCAT( DISTINCT
>>>>> IFNULL( `memberships`.`user_id` , '0' )
>>>>> SEPARATOR ',' ) AS `member_ids` , IF( projects.private =0, '0',
>>>>> GROUP_CONCAT( memberships.user_id ) ) AS `members_or_public`
>>>>> FROM `projects`
>>>>> LEFT OUTER JOIN `writeboards` ON writeboards.project_id = projects.id
>>>>> LEFT OUTER JOIN `writeboard_contents` ON
>>>>> writeboard_contents.writeboard_id = writeboards.id
>>>>> LEFT OUTER JOIN `writeboard_contents`
>>>>> writeboard_comments_writeboards_join ON ( `writeboards`.`id` =
>>>>> `writeboard_comments_writeboards_join`.`writeboard_id` )
>>>>> LEFT OUTER JOIN `writeboard_comments` ON
>>>>> ( `writeboard_comments`.`writeboard_content_id` =
>>>>> `writeboard_comments_writeboards_join`.`id` )
>>>>> LEFT OUTER JOIN `folders` ON folders.project_id = projects.id
>>>>> LEFT OUTER JOIN `project_documents` ON project_documents.folder_id =
>>>>> folders.id
>>>>> LEFT OUTER JOIN `memberships` ON memberships.project_id = projects.id
>>>>> WHERE `projects`.`id` >=1
>>>>> AND `projects`.`id` <=10
>>>>> AND `projects`.`delta` =0
>>>>> AND projects.inactive =0
>>>>> GROUP BY `projects`.`id`
>>>>> ORDER BY NULL
>>>>> LIMIT 0 , 30
>> 
>>>>> MySQL said: Documentation
>>>>> #126 - Incorrect key file for table 'sql_730_0.MYI'; try to repair it
>> 
>>>>> I will try to find out exactly what part of the query is doing that.
>>>>> If you have any idea, please let me know..
>> 
>>>>> Thanks,
>> 
>>>>> Gedeon
>> 
>>>>> On May 9, 11:51 am, Pat Allan <[email protected]> wrote:
>>>>>> It's probably worth increasing the memory limit for indexer, via the 
>>>>>> mem_limit 
>>>>>> setting:http://freelancing-god.github.com/ts/en/advanced_config.html
>>>>>> (see about halfway down the page)
>> 
>>>>>> Not sure if that will fix things, but it's definitely worth a shot.
>> 
>>>>>> --
>>>>>> Pat
>> 
>>>>>> On 07/05/2010, at 2:31 PM,Gedeonwrote:
>> 
>>>>>>> Hi all,
>> 
>>>>>>> Thanks to Pat, I got the search working properly now. However,
>>>>>>> indexing is now giving me a headache...
>> 
>>>>>>> As explained in my previous threads, I'm trying to index and search
>>>>>>> Projects. It worked fine in development, with a very small test
>>>>>>> database. However, in production (with tables under 2k records),
>>>>>>> indexing seems to take loads of disk space (which is released once
>>>>>>> indexing is over). Some examples:
>> 
>>>>>>>    indexes writeboards.writeboard_comments.body, :as
>>>>>>> => :mr_comment_body (1158 rows, 416kb)
>>>>>>>    indexes folders.project_documents.description, :as
>>>>>>> => :file_description (1457 rows, 352kb))
>> 
>>>>>>>    has memberships.user_id, :as => :member_ids, :type => :multi (652
>>>>>>> rows, 56kb)
>>>>>>>    has "IF(projects.private = 0, '0',
>>>>>>> GROUP_CONCAT( memberships.user_id))", :as => :members_or_public, :type
>>>>>>> => :multi
>> 
>>>>>>>    requires 1.4GB
>>>>>>>    -----------
>>>>>>> Now, the same without :members_or_public:
>>>>>>>    indexes writeboards.writeboard_comments.body, :as
>>>>>>> => :mr_comment_body
>>>>>>>    indexes folders.project_documents.description, :as
>>>>>>> => :file_description
>> 
>>>>>>>    has memberships.user_id, :as => :member_ids, :type => :multi
>> 
>>>>>>>    requires 1.1GB
>>>>>>>    -----------
>>>>>>> Without :member_ids :
>>>>>>>    indexes writeboards.writeboard_comments.body, :as => :mr_comment_body
>>>>>>>        indexes folders.project_documents.description, :as
>>>>>>> => :file_description
>> 
>>>>>>>    75MB
>>>>>>>    ------------
>> 
>>>>>>> Obviously, there is an issue with has memberships.user_id, :as
>>>>>>> => :member_ids, :type => :multi
>> 
>>>>>>> Also, without the :member_ids, adding indexes requires more and more
>>>>>>> disk space:
>>>>>>>    indexes writeboards.name, :as => :writeboard_name
>>>>>>>    indexes writeboards.writeboard_contents.body, :as => :mr_content_body
>>>>>>>    indexes writeboards.writeboard_comments.body, :as => :mr_comment_body
>>>>>>>        indexes folders.project_documents.description, :as
>>>>>>> => :file_description
>>>>>>>    indexes folders.project_documents.filename, :as => :file_name
>>>>>>>    indexes folders.description, :as => :folder_description
>>>>>>>    indexes folders.name, :as => :folder_name
>> 
>>>>>>>    requires 1GB of disk space for indexing.
>> 
>>>>>>> Is that a normal behaviour? If not, has anyone encountered this
>>>>>>> problem? Any idea what I should check to fix it?
>>>>>>> All my tables are pretty small and  "many indexes" + "member_ids"
>>>>>>> always crashes because I do not have enough space on the hard disk.
>>>>>>> Will I need 10Gb of disk space for indexing tables with 10k+ rows?
>>>>>>> All this is without the use of wildcards, infix or prefix. Enabling
>>>>>>> them does not seem to have any effect on the disk space required.
>> 
>>>>>>> --
>>>>>>> You received this message because you are
>> 
>> ...
>> 
>> read more ยป
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.
> 

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/thinking-sphinx?hl=en.

Reply via email to