Re: [rt-users] case insensitive searching in Content?
... 3rd try to mail ... On Sun, Jun 06, 2010 at 02:34:58PM -0500, Kenneth Marshall wrote: Hi Zito, You can look in the mailing list archives for several similar discoveries on how MySQL works with content searches and the longblob data type. I would definitely recommend using PostgreSQL for its full-text indexing available in the wiki. Of course, I am partial... :) Hi Ken, thanks for the suggestion. I red the wiki page http://wiki.bestpractical.com/view/PostgreSQLFullText already. I'm definitely going to try this way, also patching DBIx::SearchBuilder is a bit hack, but ILIKE is too slow on Pg. Regards -- Zito Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] case insensitive searching in Content?
On Tue, Jun 08, 2010 at 09:10:23AM +0200, V?clav Ovs?k wrote: ... 3rd try to mail ... On Sun, Jun 06, 2010 at 02:34:58PM -0500, Kenneth Marshall wrote: Hi Zito, You can look in the mailing list archives for several similar discoveries on how MySQL works with content searches and the longblob data type. I would definitely recommend using PostgreSQL for its full-text indexing available in the wiki. Of course, I am partial... :) Hi Ken, thanks for the suggestion. I red the wiki page http://wiki.bestpractical.com/view/PostgreSQLFullText already. I'm definitely going to try this way, also patching DBIx::SearchBuilder is a bit hack, but ILIKE is too slow on Pg. Regards -- Zito Hi Zito, ILIKE is fine on PostgreSQL as long as the search is anchored on the left and there is an index. Otherwise it will do a sequential scan of the attachments table to find matching records. The wiki entries for full-text indexing with PostgreSQL and Oracle can avoid the sequential scan through the use of the database's full-text index support. I do agree that patching the actual DBIx::SearchBuilder is not preferable. Unfortunately, since that is the API that is used to provide a database abstraction layer and it does not yet include the ability to specify full-text indexing details, you have to patch it. If it is any consolation, the patch is much, much simpler (trivial) for the PostgreSQL support versus the version that is needed for OracleText. :) Regards, Ken Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] case insensitive searching in Content?
Hi Ken, On Tue, Jun 08, 2010 at 07:48:15AM -0500, Kenneth Marshall wrote: ILIKE is fine on PostgreSQL as long as the search is anchored on the left and there is an index. Otherwise it will do a sequential scan of the attachments table to find matching records. The wiki entries for full-text indexing with PostgreSQL and Oracle can avoid the sequential scan through the use of the database's full-text thanks for your work on PostgreSQLFullText wiki page! Good starting point. I'm playing with the fulltext on one instance already. I must discuss the possibility to use the fulltext on Pg with my leader. The results are different from those get from ILIKE. I tried to find 'cisco' for example :). Normal search returned 25 tickets, but fulltext only 15. Fulltext parser did tokens from URL and www.cisco.com was a lexem as is (not broken further). I can't say this is fine or not, it is difference. Users must know this - how to write the query correctly. index support. I do agree that patching the actual DBIx::SearchBuilder is not preferable. Unfortunately, since that is the API that is used to provide a database abstraction layer and it does not yet include the ability to specify full-text indexing details, you have to patch it. If it is any consolation, the patch is much, much simpler (trivial) for the PostgreSQL support versus the version that is needed for OracleText. :) I completely understand. I put the package libdbix-searchbuilder-perl (Debian) on hold in aptitude, so sysadmin should remember, that this package needs a special handling. I did a copy of SearchBuilder.pm file into rt-prefix/local/lib/DBIx/ and did a modification. This perl path is searched first, so a modification is beside other RT code... I have prepared a script for converting data from Mysql to Pg and for adding/removing fulltext support based on your wiki page. Maybe it can be added to wiki for someone else to play with fulltext and Pg. I must to test it a bit and switch some fixed parts into conditional blocks controlled by command-line options. Regards -- Zito Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
Re: [rt-users] case insensitive searching in Content?
On Tue, Jun 08, 2010 at 04:57:25PM +0200, V?clav Ovs?k wrote: Hi Ken, On Tue, Jun 08, 2010 at 07:48:15AM -0500, Kenneth Marshall wrote: ILIKE is fine on PostgreSQL as long as the search is anchored on the left and there is an index. Otherwise it will do a sequential scan of the attachments table to find matching records. The wiki entries for full-text indexing with PostgreSQL and Oracle can avoid the sequential scan through the use of the database's full-text thanks for your work on PostgreSQLFullText wiki page! Good starting point. I'm playing with the fulltext on one instance already. I must discuss the possibility to use the fulltext on Pg with my leader. The results are different from those get from ILIKE. I tried to find 'cisco' for example :). Normal search returned 25 tickets, but fulltext only 15. Fulltext parser did tokens from URL and www.cisco.com was a lexem as is (not broken further). I can't say this is fine or not, it is difference. Users must know this - how to write the query correctly. Correct, the results depend on the parsing rules that are used by the full-text configuration. These rules can be adapted to your needs, but I do agree that some explanation about common types of query problems would be worthwhile. Regards, Ken Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com
[rt-users] case insensitive searching in Content?
Hi, I'm preparing migration from RT 3.2.2 + Mysql 3.x to RT 3.8.8 + Mysql or Pg. I just discovered a problem with the case insensitive searching in the column Attachments.Content (longblob) while playing with a new RT instance. Ticket SQL query Content like 'outlook' on our current (old) RT instance, returns some result set. Column Content has data-type longtext in the old database schema, so every letter case combination are found. On a new testing platform RT 3.8.8 + Mysql only partial set of tickets is returned with the same query. I must to modify the query to Content like 'outlook' or Content like 'Outlook' at least to see the most of the tickets. On the other hand RT 3.8.8 + Pg has Content column data-type text, so case-insensitive searching works, but there is another issue with non UTF-8 attachments encoded in quoted-printable, but this is a minor problem. Pg database behaves better than Mysql in this aspect. Is there any intention to detach non textual data from the Content column into e.g. ContentB in future? Have anyone some knowledge or suggestion about this? Best Regards -- Zito Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com