Hi Jeff, Those are just what I would have expected. I would take one of the results from the second set of 77 and figure out why it is not matching the first query. It does look more and more like a collation problem. What is the collation of an attachment that matches one query but not the other?
Ken On Mon, May 17, 2010 at 03:20:00PM -0400, Jeff Blaine wrote: > Here are the transactions as logged. FWIW, I find it curious > that all of these search failures for us so far are with > search terms that have a '.' in the term... > > LOG: duration: 35.815 ms statement: EXECUTE <unnamed> [PREPARE: SELECT > DISTINCT main.* FROM Tickets main > JOIN Transactions Transactions_1 > ON ( Transactions_1.ObjectId = main.id ) > JOIN Attachments Attachments_2 > ON ( Attachments_2.TransactionId = Transactions_1.id ) > WHERE (Transactions_1.ObjectType = 'RT::Ticket') > AND (main.Status != 'deleted') > AND ( ( Attachments_2.Content ILIKE '%members.linode%' ) AND > main.Owner = '66') > AND (main.Type = 'ticket') > AND (main.EffectiveId = main.id) > ORDER BY main.id ASC ] > > LOG: duration: 38.362 ms statement: EXECUTE <unnamed> [PREPARE: SELECT > DISTINCT main.* FROM Tickets main > JOIN Transactions Transactions_1 > ON ( Transactions_1.ObjectId = main.id ) > JOIN Attachments Attachments_2 > ON ( Attachments_2.TransactionId = Transactions_1.id ) > WHERE (Transactions_1.ObjectType = 'RT::Ticket') > AND (main.Status != 'deleted') > AND ( ( Attachments_2.Content ILIKE '%linode%' ) AND main.Owner = > '66') > AND (main.Type = 'ticket') > AND (main.EffectiveId = main.id) > ORDER BY main.id ASC ] > > On 5/17/2010 3:00 PM, Kenneth Marshall wrote: >> And what queries actually hit the database backend for these >> two? If it is the same SQL query, then the problem is somewhere >> else. If the query differs, try running the query manually to >> see why it is giving unexpected results. Again as others have >> mentioned, collation/encoding could cause these types of differences. >> >> Cheers, >> Ken >> >> On Mon, May 17, 2010 at 02:48:49PM -0400, Jeff Blaine wrote: >>> Another example of this failing: >>> >>> rt list -t ticket "Content like members.linode AND Owner = jsmith" >>> >>> 14 results >>> >>> rt list -t ticket "Content like linode AND Owner = jsmith" >>> >>> 77 results, all of which have "members.linode" in the content. >>> >>> On 4/26/2010 5:32 PM, Jeff Blaine wrote: >>>> On 4/26/2010 5:04 PM, Kenneth Marshall wrote: >>>>> I am not certain what to do. For myself, I would turn on >>>>> SQL statement logging within the database. Set: >>>>> >>>>> log_min_duration_statement = 0 >>>>> >>>>> in your postgresql.conf for the database and run both >>>>> queries in RT. Then turn it back done and pick apart the >>>>> query results to see if the problem is a logic problem or >>>>> possibly an encoding issue which results in "equal" items >>>>> not showing as "equal". Also, it looks like you are just >>>> >>>> These were sequential in the log as a result of my simple >>>> "Content matches foo.com" query via the web GUI. One is >>>> a COUNT and the other is for getting data, apparently. >>>> >>>> I formatted them here for easier reading. >>>> >>>> LOG: duration: 101.115 ms statement: EXECUTE<unnamed> [PREPARE: >>>> SELECT COUNT(DISTINCT main.id) FROM Tickets main >>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id >>>> ) >>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId = >>>> Transactions_1.id ) >>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket') >>>> AND (main.Status != 'deleted') >>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) ) >>>> AND (main.Type = 'ticket') >>>> AND (main.EffectiveId = main.id) >>>> ] >>>> >>>> LOG: duration: 105.350 ms statement: EXECUTE<unnamed> [PREPARE: >>>> SELECT DISTINCT main.* FROM Tickets main >>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id >>>> ) >>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId = >>>> Transactions_1.id ) >>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket') >>>> AND (main.Status != 'deleted') >>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) ) >>>> AND (main.Type = 'ticket') >>>> AND (main.EffectiveId = main.id) >>>> ORDER BY main.id ASC LIMIT 50 >>>> ] >>>> >>>> What "both queries" are you referring to? The RT CLI and the >>>> RT web GUI both fail the same way with my 1 test query. I'm >>>> confused by what you mean there. >>>> >>>> Thanks for the ideas, Ken >>>> >>>>> starting with such a low ticket number, but if you plan >>>>> on using full content searches for any large-ish amount >>>>> of data you should really plan on using the full-text index >>>>> support for PostgreSQL/RT that is described in the wiki. >>>>> Speaking as one whose database was brought to its knees by >>>>> some creative data mining. :) >>>>> >>>>> Good luck and let us know what you find to be the cause. >>>>> Regards, >>>>> >>>>> Ken >>>>> >>>>> >>>>> On Mon, Apr 26, 2010 at 04:54:45PM -0400, Jeff Blaine wrote: >>>>>> On 4/26/2010 4:52 PM, Kenneth Marshall wrote: >>>>>>> Jeff, >>>>>>> >>>>>>> Are you using the Full-text index support from the wiki? >>>>>> >>>>>> Nope. >>>>>> >>>>>> Just RT 3.8.7 + RTFM 2.4.2. No add-ons/tweaks from the wiki. >>>>>> >>>>>>> There were some index bugs that may require you to re-index >>>>>>> to fix, specifically some rows were not reported correctly >>>>>>> as valid matches. I believe that the PostgreSQL release >>>>>>> notes mentioned that need. Maybe that is your problem. We >>>>>>> are using RT-3.8.5 and PostgreSQL 8.4.2 here with the full-text >>>>>>> support without an issue. >>>>>>> >>>>>>> Regards, >>>>>>> Ken >>>>>>> >>>>>>> >>>>>>> On Mon, Apr 26, 2010 at 03:00:39PM -0400, Jeff Blaine wrote: >>>>>>>> On 4/26/2010 2:19 PM, Kenneth Marshall wrote: >>>>>>>>> Well, that knocks out the ACL issue. Do you think that your >>>>>>>>> Mason cache is confused? Maybe stop RT, clear the cache, and >>>>>>>>> restart RT to see if that helps. What DB backend are you using >>>>>>>>> and which version of RT are you running? >>>>>>>> >>>>>>>> RT 3.8.7 >>>>>>>> PostgreSQL as it comes with RHELv5 + updates from yum >>>>>>>> >>>>>>>> Clearing the Mason cache didn't help :| >>>>>>>> >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Ken >>>>>>>>> >>>>>>>>> On Mon, Apr 26, 2010 at 12:47:22PM -0400, Jeff Blaine wrote: >>>>>>>>>> On 4/26/2010 12:29 PM, Raed El-Hames wrote: >>>>>>>>>>> Jeff; >>>>>>>>>>> >>>>>>>>>>> Does your CLI user have permissions on the queue that ticket 39 >>>>>>>>>>> is >>>>>>>>>>> in?? >>>>>>>>>>> login to the web interface with the same cli user and see if you >>>>>>>>>>> can >>>>>>>>>>> view the ticket. >>>>>>>>>> >>>>>>>>>> Yes, it does. >>>>>>>>>> >>>>>>>>>> Again, however, this is not really a report about an anomaly in >>>>>>>>>> the RT CLI. >>>>>>>>>> >>>>>>>>>> The incorrect search results are returned via a web GUI search >>>>>>>>>> of "Content matches foo.com" >>>>>>>>>> >>>>>>>>>> Here, maybe this makes it more clear, showing the same problem >>>>>>>>>> when using the RT CLI: >>>>>>>>>> >>>>>>>>>> [r...@rtsrv1 etc]# /apps/rt/bin/rt list "Content like foo.com" >>>>>>>>>> Query:Content like 'foo.com' >>>>>>>>>> Ticket Owner Queue Age Told Status Requestor Subject >>>>>>>>>> -------------------------------------------------------------------------------- >>>>>>>>>> >>>>>>>>>> 23 mbs Incid 1 wk resolv enVision@ alert >>>>>>>>>> -NICAlert-Secur >>>>>>>>>> [r...@rtsrv1 etc]# >>>>>>>>>> >>>>>>>>>> [r...@rtsrv1 etc]# /apps/rt/bin/rt show 39 | grep foo.com >>>>>>>>>> foo.com blah blah... 1 line... not including in this email >>>>>>>>>> [r...@rtsrv1 etc]# >>>>>>>>>> >>>>>>>>>> [r...@rtsrv1 etc]# /apps/rt/bin/rt show 23 | grep foo.com >>>>>>>>>> foo.com blah blah... not including in this email >>>>>>>>>> foo.com matching lines 66 more times... not including in this >>>>>>>>>> email >>>>>>>>>> [r...@rtsrv1 etc]# >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Regards; >>>>>>>>>>> Roy >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Jeff Blaine wrote: >>>>>>>>>>>> On 4/26/2010 11:50 AM, Kenneth Marshall wrote: >>>>>>>>>>>>> Hi Jeff, >>>>>>>>>>>>> >>>>>>>>>>>>> There is nothing here that indicates a problem. It looks >>>>>>>>>>>>> like an apples vs. oranges comparison by the time you include >>>>>>>>>>>>> the actual parameters of the search from the web interface >>>>>>>>>>>>> and the rt commandline interface and possible privilege and >>>>>>>>>>>>> ACL differences. You can use DB query logging to figure out >>>>>>>>>>>> >>>>>>>>>>>> I think my original post is being misinterpreted. The 'rt' >>>>>>>>>>>> CLI commands aren't doing a search. They're just showing >>>>>>>>>>>> this list's readers that 'foo.com' does show up in each of >>>>>>>>>>>> the tickets when doing a simple 'rt show<ticket>'. It's >>>>>>>>>>>> not a comparison of "CLI search vs. web search". >>>>>>>>>>>> >>>>>>>>>>>>> what SQL is being used in the web search or the commandline >>>>>>>>>>>>> rt and compare the output piece-wise to put yourself at ease. >>>>>>>>>>>>> Maybe look at the individual components of each of the two >>>>>>>>>>>>> tickets, as well. >>>>>>>>>>>> >>>>>>>>>>>> When viewing the tickets using 'Full headers" and then >>>>>>>>>>>> "Ctrl-F" to examine every instance of 'foo.com' in each ticket >>>>>>>>>>>> shows that both tickets have the 'foo.com' in text/html parts >>>>>>>>>>>> (and only there). >>>>>>>>>>>> >>>>>>>>>>>> Ticket 23 has 67 of those parts and is returned when RT >>>>>>>>>>>> searching >>>>>>>>>>>> for 'foo.com' >>>>>>>>>>>> >>>>>>>>>>>> Ticket 39 has 1 of those parts and is not returned when RT >>>>>>>>>>>> searching >>>>>>>>>>>> for 'foo.com' >>>>>>>>>>>> >>>>>>>>>>>> By "DB query logging" do you mean Set($StatementLog, "DEBUG"); >>>>>>>>>>>> or something? >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the reply, Ken >>>>>>>>>>>> >>>>>>>>>>>> Jeff >>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> Ken >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Apr 26, 2010 at 11:21:45AM -0400, Jeff Blaine wrote: >>>>>>>>>>>>>> Does anyone have any suggestions for how to go about >>>>>>>>>>>>>> figuring out what is wrong here? >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/22/2010 2:09 PM, Jeff Blaine wrote: >>>>>>>>>>>>>>> RT 3.8.7 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> A search for 'Content matches foo.com' is returning some >>>>>>>>>>>>>>> tickets >>>>>>>>>>>>>>> and missing others that clearly have foo.com in the Content. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [r...@rtsrv1 bin]# ./rt show 39 | grep foo.com | wc -l >>>>>>>>>>>>>>>> 1 >>>>>>>>>>>>>>>> [r...@rtsrv1 bin]# >>>>>>>>>>>>>>>> [r...@rtsrv1 bin]# ./rt show 23 | grep foo.com | wc -l >>>>>>>>>>>>>>>> 67 >>>>>>>>>>>>>>>> [r...@rtsrv1 bin]# >>>>>>>>>>>>>>> 23 shows up in the web search results. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 39 does not. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Any ideas? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly >>>>>>>>>>>>>>> Media. >>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly >>>>>>>>>>>>>> Media. >>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly >>>>>>>>>>>> Media. >>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media. >>>> Buy a copy at http://rtbook.bestpractical.com >>>> >>> >>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media. >>> Buy a copy at http://rtbook.bestpractical.com >>> >> > Discover RT's hidden secrets with RT Essentials from O'Reilly Media. Buy a copy at http://rtbook.bestpractical.com