Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bruce Momjian
On Fri, Mar 4, 2022 at 10:22:11PM +0530, Atri Sharma wrote:
> TF/IDF should be pretty simple to implement IMO.
>
> And no, Solr does not give preference to prior documents.
>
> However, Solr allows you to "boost" specific terms, thus creating the
> impression of preference.

Postgres can do th

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Atri Sharma
TF/IDF should be pretty simple to implement IMO.

And no, Solr does not give preference to prior documents.

However, Solr allows you to "boost" specific terms, thus creating the impression of preference.

On Fri, 4 Mar 2022, 22:15 Bruce Momjian, wrote:
> On Fri, Mar 4, 2022 at 11:43:57AM -0500,
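
For illustration only, here is a minimal sketch of the TF/IDF scoring this message refers to. It assumes whitespace tokenization, term frequency normalized by document length, and a plain log(N/df) inverse document frequency. The function name tf_idf_scores and the sample documents are hypothetical; this is not how Postgres or Solr computes rank.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, documents):
    """Return one TF/IDF score per document for the given query terms.

    Assumptions (not from the thread): lowercase whitespace tokenization,
    tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    # This is the corpus-wide statistic that IDF depends on.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if not tokens or df[term] == 0:
                continue  # term absent from the corpus, contributes nothing
            idf = math.log(n_docs / df[term])
            score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Hypothetical sample corpus.
    docs = [
        "postgres full text search",
        "solr full text search",
        "postgres ranking",
    ]
    for doc, score in zip(docs, tf_idf_scores(["postgres"], docs)):
        print(f"{score:.4f}  {doc}")
```

A term that appears in every document gets an IDF of zero under this weighting, so it cannot separate documents. That is the corpus-wide effect a purely per-document ranking cannot reproduce.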

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bruce Momjian
On Fri, Mar 4, 2022 at 11:43:57AM -0500, Tom Lane wrote:
> "Bayer, Samuel" writes:
> > One concrete question, I suppose, is: the classic TF/IDF search strategy
> > relies on inverse document frequency, which looks across the corpus. I
> > can't tell whether that corpus-wide frequency informatio

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bruce Momjian
On Fri, Mar 4, 2022 at 11:39:39AM -0500, Bayer, Samuel wrote:
> I've tried both ranking functions. I've tried a variety of the
> normalization settings. I'm using the standard English language
> configuration. Postgres 13.
>
> I do understand your FTS philosophy - I suppose I'm looking for
> guidan

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Tom Lane
"Bayer, Samuel" writes:
> One concrete question, I suppose, is: the classic TF/IDF search strategy
> relies on inverse document frequency, which looks across the corpus. I can't
> tell whether that corpus-wide frequency information is taken into account in
> either ranking function. The docume

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bayer, Samuel
I've tried both ranking functions. I've tried a variety of the normalization settings. I'm using the standard English language configuration. Postgres 13. I do understand your FTS philosophy - I suppose I'm looking for guidance about how best to approximate the search capability in Solr using t

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Tom Lane
Bruce Momjian writes:
> On Fri, Mar 4, 2022 at 10:41:16AM -0500, Bayer, Samuel wrote:
>> I apologize for not being able to be more specific.
> I know it is hard to quantify. Is it possible that Postgres is treating
> all the terms equally, while Solr is prioritizing terms that are earlier
> in t

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bruce Momjian
On Fri, Mar 4, 2022 at 10:41:16AM -0500, Bayer, Samuel wrote:
> Example anecdote: the documents I'm searching come with metadata
> (e.g., title), which I'm not indexing specially (not a separate field,
> just part of the raw text of the document). When I search even for
> single terms, and look at

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bayer, Samuel
Fair question. Not worried so much about speed. Looking, essentially, at precision by rank (i.e., average precision and variants). I have not explored the contrasts between the default English language configuration in Postgres and the one in Solr - I have no reason to believe that there's anyt

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Atri Sharma
Can you define what "high quality" is? Are you referring to precision? Or recall? Or speed? Or query dialect?

On Fri, Mar 4, 2022 at 8:59 PM Bayer, Samuel wrote:
>
> Thanks for replying. My problem is that I can't provide enough guidance on
> what isn't working, because (a) I don't have good en

Re: [EXT] Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bayer, Samuel
Thanks for replying. My problem is that I can't provide enough guidance on what isn't working, because (a) I don't have good enough intuitions about how the normalization options are expected to affect the results, and (b) I can't identify a specific missing function - I'm just observing that I

Re: Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bruce Momjian
On Fri, Mar 4, 2022 at 08:10:48AM -0500, Bayer, Samuel wrote:
> Hi all -
>
> When I have a need for both sophisticated database querying and
> full-text search, I'd rather not stand up a technology stack with
> multiple tools (e.g., Postgres and Apache Solr, or Postgres and
> ElasticSearch with a z

Looking for tips on improving full-text search quality in Postgres

2022-03-04 Thread Bayer, Samuel
Hi all - When I have a need for both sophisticated database querying and full-text search, I'd rather not stand up a technology stack with multiple tools (e.g., Postgres and Apache Solr, or Postgres and ElasticSearch with a zomboDB bridge). So I've been looking at the Postgres full-text search