I do not know if this mail got lost along the way or simply went unnoticed!
On Thu, 2010-12-23 at 11:05 +0530, Sushant Sinha wrote:
Just a reminder that this patch discusses how to break URLs, emails, etc.
into their components.
>
> On Mon, Oct 4, 2010 at 3:54 AM, Tom Lane wrote:
> [ sorry for not responding on this sooner, it's been hectic the last
> couple weeks ]
Just a reminder that this patch discusses how to break URLs, emails, etc.
into their components.
On Mon, Oct 4, 2010 at 3:54 AM, Tom Lane wrote:
> [ sorry for not responding on this sooner, it's been hectic the last
> couple weeks ]
>
> Sushant Sinha writes:
> >> I looked at this patch a bit.
[ sorry for not responding on this sooner, it's been hectic the last
couple weeks ]
Sushant Sinha writes:
>> I looked at this patch a bit. I'm fairly unhappy that it seems to be
>> inventing a brand new mechanism to do something the ts parser can
>> already do. Why didn't you code the url-part mechanism using the
>> existing support for compound words?
On Wed, Sep 29, 2010 at 1:29 AM, Sushant Sinha wrote:
> Any updates on this?
>
>
> On Tue, Sep 21, 2010 at 10:47 PM, Sushant Sinha wrote:
>>
>> > I looked at this patch a bit. I'm fairly unhappy that it seems to be
>> > inventing a brand new mechanism to do something the ts parser can
>> > already do. Why didn't you code the url-part mechanism using the
>> > existing support for compound words?
Any updates on this?
On Tue, Sep 21, 2010 at 10:47 PM, Sushant Sinha wrote:
> > I looked at this patch a bit. I'm fairly unhappy that it seems to be
> > inventing a brand new mechanism to do something the ts parser can
> > already do. Why didn't you code the url-part mechanism using the
> > existing support for compound words?
> I looked at this patch a bit. I'm fairly unhappy that it seems to be
> inventing a brand new mechanism to do something the ts parser can
> already do. Why didn't you code the url-part mechanism using the
> existing support for compound words?
I am not familiar with the compound word implementation [...]
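If "compound words" here refers to the parser's existing hyphenated-word
handling, a quick sketch of that mechanism on a stock install, using the
standard ts_debug function, looks like this (sample word made up):

    -- The default parser already emits a whole compound token plus its parts:
    SELECT alias, token
    FROM ts_debug('english', 'multi-level-index');
    -- roughly: asciihword "multi-level-index", then hword_asciipart
    -- "multi", "level", "index" (with blank tokens for the hyphens)

The url-part mechanism suggested above would follow the same shape: one
whole token plus part tokens, instead of a new mechanism.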
Sushant Sinha writes:
> For the headline generation to work properly, email/file/url/host need
> to become skip tokens. Updating the patch with that change.
I looked at this patch a bit. I'm fairly unhappy that it seems to be
inventing a brand new mechanism to do something the ts parser can
already do. Why didn't you code the url-part mechanism using the
existing support for compound words?
For the headline generation to work properly, email/file/url/host need
to become skip tokens. Updating the patch with that change.
-Sushant.
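For context, headline generation walks the same token stream, so a minimal
ts_headline call of the kind affected would be (the sample text and query
here are made up):

    -- On an unpatched server 'wikipedia' does not match the whole host/url
    -- tokens, so nothing in this document gets highlighted:
    SELECT ts_headline('english',
                       'See http://wikipedia.org/wiki/Full_text_search for details',
                       to_tsquery('english', 'wikipedia'));

With the patch emitting extra part tokens, those parts have to be skip
tokens so they do not distort the generated headline.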
On Sat, 2010-09-04 at 13:25 +0530, Sushant Sinha wrote:
> Updating the patch with emitting parttoken and registering it with
> snowball config.
>
> -Sushant.
Updating the patch with emitting parttoken and registering it with
snowball config.
-Sushant.
On Fri, 2010-09-03 at 09:44 -0400, Robert Haas wrote:
> On Wed, Sep 1, 2010 at 2:42 AM, Sushant Sinha wrote:
> > I have attached a patch that emits parts of a host token, a url token,
> > an email token and a file token. [...]
On Wed, Sep 1, 2010 at 2:42 AM, Sushant Sinha wrote:
> I have attached a patch that emits parts of a host token, a url token,
> an email token and a file token. Further, it makes sure that a
> host/url/email/file token and the first part-token are at the same
> position in tsvector.
You should pr[...]
I have attached a patch that emits parts of a host token, a url token,
an email token and a file token. Further, it makes sure that a
host/url/email/file token and the first part-token are at the same
position in tsvector.
The two major changes are:
1. Tokenization changes: The patch exploits the [...]
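As a rough sketch of what the stock parser already does with a URL (token
type names as documented for the default parser; the sample URL is made up):

    -- Stock behaviour: protocol, url, host and url_path tokens are emitted,
    -- but no separate "wikipedia", "org" or "wiki" part tokens:
    SELECT alias, token
    FROM ts_debug('english', 'http://wikipedia.org/wiki/Text_search');

The patch, as described above, additionally emits the parts and keeps the
whole host/url/email/file token and its first part at the same tsvector
position.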
On Mon, Aug 2, 2010 at 10:21 AM, Kevin Grittner wrote:
> Sushant Sinha wrote:
>
>> Yes, that's what I am planning to do. I just wanted to see if anyone
>> can help me estimate whether this is doable in the current parser or
>> whether I need to write a new one. If possible, some ideas on how to
>> go about implementing it?
Sushant Sinha wrote:
> Yes, that's what I am planning to do. I just wanted to see if anyone
> can help me estimate whether this is doable in the current parser or
> whether I need to write a new one. If possible, some ideas on how to
> go about implementing it?
The current tsearch parser is a stat[...]
Sushant Sinha writes:
>> This would needlessly increase the number of tokens. Instead you'd
>> better make it work like compound word support, having just "wikipedia"
>> and "org" as tokens.
> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. [...]
On Mon, 2010-08-02 at 09:32 -0400, Robert Haas wrote:
> On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha wrote:
> > The current text parser already returns url and url_path. That already
> > increases the number of unique tokens. I am only asking to add normal
> > English words as well so that if someone types only "wikipedia" he gets
> > a match.
On Mon, Aug 2, 2010 at 9:12 AM, Sushant Sinha wrote:
> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. I am only asking to add normal
> English words as well so that if someone types only "wikipedia" he gets
> a match.
[...]
>
Hi,
On 08/02/2010 03:12 PM, Sushant Sinha wrote:
The current text parser already returns url and url_path. That already
increases the number of unique tokens.
Well, I think I simply turned that off to be able to search for plain
words. It still works for complete URLs, those are just treated [...]
> On 08/01/2010 08:04 PM, Sushant Sinha wrote:
> > 1. We do not have separate tokens "wikipedia" and "org"
> > 2. If we have the two tokens, we should have them at adjacent positions so
> > that a phrase search for "wikipedia org" works.
>
> This would needlessly increase the number of tokens.
Hi,
On 08/01/2010 08:04 PM, Sushant Sinha wrote:
1. We do not have separate tokens "wikipedia" and "org"
2. If we have the two tokens, we should have them at adjacent positions so
that a phrase search for "wikipedia org" works.
This would needlessly increase the number of tokens. Instead you'd
better make it work like compound word support, having just "wikipedia"
and "org" as tokens.
Currently the English parser in text search does not support multiple
tokens at the same position. Consider a word like "wikipedia.org". The text
search parser returns a single token "wikipedia.org". However, if someone
searches for "wikipedia org", there will not be a match. There are
two problems here:
1. We do not have separate tokens "wikipedia" and "org".
2. If we have the two tokens, we should have them at adjacent positions so
that a phrase search for "wikipedia org" works.
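A minimal illustration of problem 1 on a stock install (sample text made up;
exact lexemes depend on the configuration):

    -- The host is indexed as one lexeme ...
    SELECT to_tsvector('english', 'visit wikipedia.org for details');
    --   roughly: 'detail':4 'visit':1 'wikipedia.org':2
    -- ... so querying for just "wikipedia" finds nothing:
    SELECT to_tsvector('english', 'visit wikipedia.org for details')
           @@ to_tsquery('english', 'wikipedia');
    --   false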