[sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Hi all,

Is the algorithm used in the stemming tokenizer in SQLite's FTS extension 
equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/ ?

I am asking because some sources say that improved versions of this 
algorithm were released much later than 2000/2001. Does SQLite's implementation 
differ in any significant way from the C implementation found at the above URL?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Richard Hipp
On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall phi...@blastbay.com wrote:

> Is the algorithm used in the stemming tokenizer in SQLite's FTS extension
> equivalent to the C implementation found at
> http://tartarus.org/~martin/PorterStemmer/ ?


The built-in Porter stemmer is a copy/paste from the above link.
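
For a concrete sense of what the built-in stemmer does at query time, here is a minimal Python sketch (an illustration, not part of the original reply). It assumes the Python `sqlite3` module is linked against an SQLite build with FTS3/FTS4 enabled; the table names `docs_porter` and `docs_simple`, the column `body`, and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two otherwise identical FTS4 tables that differ only in their tokenizer.
# (Hypothetical names; requires an SQLite build with FTS3/FTS4 enabled.)
conn.execute("CREATE VIRTUAL TABLE docs_porter USING fts4(body, tokenize=porter)")
conn.execute("CREATE VIRTUAL TABLE docs_simple USING fts4(body, tokenize=simple)")

rows = [("running shoes",), ("he runs daily",)]
for table in ("docs_porter", "docs_simple"):
    conn.executemany(f"INSERT INTO {table}(body) VALUES (?)", rows)


def match_count(table, term):
    """Return how many rows of `table` match the full-text query `term`."""
    sql = f"SELECT count(*) FROM {table} WHERE {table} MATCH ?"
    return conn.execute(sql, (term,)).fetchone()[0]


# Porter reduces "running" and "runs" to the stem "run" before indexing.
porter_hits = match_count("docs_porter", "run")
# The simple tokenizer only case-folds, so "run" is not an indexed term.
simple_hits = match_count("docs_simple", "run")
print(porter_hits, simple_hits)
```

With the Porter tokenizer, a query for "run" should match both rows, because "running" and "runs" are reduced to the same stem; with the simple tokenizer it should match neither.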




-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Thanks, Richard. That's good to know, because I am trying to decide whether to 
add a new tokenizer with some custom processing or to use the built-in 
stemmer.

Kind regards,

Philip Bennefall


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
I had another quick question. If I have built an FTS table using the stemmer 
tokenizer and later decide that I want to change to the simple one, is 
there an easy way to do this? I see the rebuild command; can I somehow tell 
it to change the tokenizer as well? I see the reference to custom ones, but 
what about the internal implementations?

Kind regards,

Philip Bennefall


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Richard Hipp
On Fri, Jun 15, 2012 at 9:00 AM, Philip Bennefall phi...@blastbay.com wrote:

> I had another quick question. If I have built an FTS table using the
> stemmer tokenizer and later decide that I want to change to the
> simple one, is there an easy way to do this? I see the rebuild command;
> can I somehow tell it to change the tokenizer as well? I see the
> reference to custom ones, but what about the internal implementations?


If you change your tokenizer, you need to retokenize all of the source text.
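
The reason is that the index stores the tokenizer's output rather than the raw words. The following Python sketch (an illustration, not part of the original reply) lists the indexed terms through the `fts4aux` helper table and then runs the 'rebuild' command. It assumes an SQLite build with FTS4 and `fts4aux` available; `docs`, `body`, and `docs_terms` are hypothetical names.

```python
import sqlite3

# Autocommit mode, so each statement is committed (and the index flushed)
# before the next one reads it.
conn = sqlite3.connect(":memory:", isolation_level=None)

conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body, tokenize=porter)")
conn.execute("INSERT INTO docs(body) VALUES ('running shoes')")

# fts4aux exposes the terms stored in the full-text index of `docs`.
conn.execute("CREATE VIRTUAL TABLE docs_terms USING fts4aux(docs)")


def indexed_terms():
    """Return the distinct terms currently stored in the index of `docs`."""
    sql = "SELECT DISTINCT term FROM docs_terms ORDER BY term"
    return [row[0] for row in conn.execute(sql)]


before = indexed_terms()

# 'rebuild' discards the index and recreates it from the stored text, but it
# uses the tokenizer the table was created with; it takes no tokenizer argument.
conn.execute("INSERT INTO docs(docs) VALUES ('rebuild')")
after = indexed_terms()
print(before, after)
```

The index should hold the stem "run" rather than "running", both before and after the rebuild, which is why a different tokenizer needs a fresh pass over the source text.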




-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
I understand that, but let's say I already have an FTS virtual table that 
was created with the Porter tokenizer. How would I then go about 
rebuilding and retokenizing this table with the simple tokenizer at a later 
time? Would I need to create an entirely new table? Basically, I'm wondering 
how I might take an existing FTS virtual table, change its tokenizer, 
and then rebuild the index.

Kind regards,

Philip Bennefall


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Richard Hipp
On Fri, Jun 15, 2012 at 9:26 AM, Philip Bennefall phi...@blastbay.com wrote:

> I understand that, but let's say I already have an FTS virtual table that
> was created with the Porter tokenizer. How would I then go about
> rebuilding and retokenizing this table with the simple tokenizer at a later
> time? Would I need to create an entirely new table? Basically, I'm wondering
> how I might take an existing FTS virtual table, change its tokenizer,
> and then rebuild the index.


Yes.  You'll need to DROP or RENAME the original table, then CREATE the new
one.
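
One way the RENAME, CREATE, copy, DROP sequence might look is sketched below in Python (an illustration, not part of the original reply). It assumes an SQLite build with FTS4 enabled and a table that stores its own text (not a contentless or external-content table); `docs`, `docs_old`, and `body` are hypothetical names.

```python
import sqlite3

# Autocommit mode keeps the sketch simple; real code may prefer to wrap the
# four migration steps in one explicit transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Starting point: an FTS4 table that uses the Porter tokenizer.
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body, tokenize=porter)")
conn.executemany("INSERT INTO docs(body) VALUES (?)",
                 [("running shoes",), ("he runs daily",)])


def match_count(term):
    """Return how many rows of `docs` match the full-text query `term`."""
    sql = "SELECT count(*) FROM docs WHERE docs MATCH ?"
    return conn.execute(sql, (term,)).fetchone()[0]


hits_before = match_count("run")  # Porter stems "running" and "runs" to "run"

# 1. Move the original table out of the way.
conn.execute("ALTER TABLE docs RENAME TO docs_old")
# 2. Create the replacement with the tokenizer you want.
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body, tokenize=simple)")
# 3. Copy the source text across; inserting it retokenizes every row.
conn.execute("INSERT INTO docs(docid, body) SELECT docid, body FROM docs_old")
# 4. Discard the old table together with its index.
conn.execute("DROP TABLE docs_old")

hits_after = match_count("run")    # the simple tokenizer does no stemming
exact_after = match_count("runs")  # exact terms still match
print(hits_before, hits_after, exact_after)
```

Copying `docid` along with the text keeps row identifiers stable, which matters if other tables refer to them.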



-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Understood. Thank you very much for your quick help. Now I have all the 
information I need to get coding. And thanks once again for a great library!

Kind regards,

Philip Bennefall