Roberto,

>Though I cannot use DLLs since it is an iPhone iOS (MacOSX) 
>operational system.

I build it as a DLL by default because that fits my needs.  You can 
still compile the extension (or part of it) as a standard .o object 
file and link it statically into your application.

>I was hoping for a collation callback that is called for all 
>characters, not only the first.

A collation always works on the full arguments it is supplied with, 
i.e. on the whole strings that are to be collated.  This behavior is 
expected and fully documented, as Igor also points out in his recent 
reply.

>Shouldn't sqlite3_create_collation be called for every single character?

No, it has to be called once per connection for every external 
collation function you require (external meaning not already defined 
by default in the SQLite core).
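
To make the two points above concrete, here is a minimal sketch of what 
a collation callback looks like.  The function name and the 
ASCII-only case folding are my own assumptions for illustration; this 
is not code from the extension.  The signature is the one SQLite 
expects, and the callback receives both strings in full, with their 
byte lengths.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical example collation: ASCII case-insensitive comparison.
 * SQLite hands the callback both complete strings together with
 * their byte lengths; they are not guaranteed to be NUL-terminated. */
int ascii_ci_collation(void *unused, int len_a, const void *a,
                       int len_b, const void *b)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    int n = len_a < len_b ? len_a : len_b;
    int i;
    (void)unused;

    /* Walk the common prefix and stop at the first difference. */
    for (i = 0; i < n; i++) {
        int ca = toupper(pa[i]);
        int cb = toupper(pb[i]);
        if (ca != cb)
            return ca - cb;
    }
    /* Common prefix is equal: the shorter string sorts first. */
    return len_a - len_b;
}
```

Such a callback would be registered once on each connection, along the 
lines of sqlite3_create_collation(db, "ASCII_CI", SQLITE_UTF8, NULL, 
ascii_ci_collation), where "ASCII_CI" is again just an example name.  
SQLite then calls it as many times as it needs, each time with two 
whole strings.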

>  Let's say the comparing names are "São Paulo" and "Santos". 
> ->  SELECT * FROM Game WHERE TeamHome = 'SANTOS' COLLATE anyCIAI;

I have no idea what anyCIAI means in this context.  The extension I 
proposed offers 4 new collation functions (NOCASE, which overrides the 
built-in NOCASE, plus UNACCENTED, NAMES and NUMERICS).  Since these 
internally use a Windows call, you can't use their code as is on iOS.  
What I would do in your situation is write a new collation relying on 
the unaccenting internal functions provided in the extension.
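
As a rough idea of what such a collation could look like without any 
Windows call, here is a portable sketch.  It is not the extension's 
code: the function names are hypothetical, the folding table only 
covers U+00C0..U+00FF, and only 1- and 2-byte UTF-8 sequences are 
decoded.  A real version would rely on the extension's unaccenting 
tables instead.

```c
#include <stddef.h>

/* Fold ASCII lowercase and Latin-1 Supplement letters (U+00C0..U+00FF)
 * to an unaccented uppercase ASCII letter.  Other code points are
 * returned unchanged.  Illustrative table only. */
static unsigned fold_codepoint(unsigned cp)
{
    /* Index 0 is U+00C0; '.' marks code points left unchanged. */
    static const char base[] =
        "AAAAAA.CEEEEIIII"  /* U+00C0..U+00CF */
        ".NOOOOO.OUUUUY.."  /* U+00D0..U+00DF */
        "AAAAAA.CEEEEIIII"  /* U+00E0..U+00EF */
        ".NOOOOO.OUUUUY.Y"; /* U+00F0..U+00FF */
    if (cp >= 'a' && cp <= 'z')
        return cp - 'a' + 'A';
    if (cp >= 0xC0 && cp <= 0xFF && base[cp - 0xC0] != '.')
        return (unsigned char)base[cp - 0xC0];
    return cp;
}

/* Decode the next code point from p[0..len).  Only 1- and 2-byte
 * UTF-8 sequences are decoded.  Any other non-ASCII byte is mapped
 * above the Unicode range so that it is never folded by mistake. */
static unsigned next_codepoint(const unsigned char *p, int len, int *used)
{
    if (len >= 2 && (p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
        *used = 2;
        return ((unsigned)(p[0] & 0x1F) << 6) | (unsigned)(p[1] & 0x3F);
    }
    *used = 1;
    return p[0] < 0x80 ? p[0] : 0x110000u + p[0];
}

/* Hypothetical case- and accent-insensitive collation callback. */
int unaccented_collation(void *unused, int len_a, const void *a,
                         int len_b, const void *b)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    (void)unused;

    while (len_a > 0 && len_b > 0) {
        int ua, ub;
        unsigned ca = fold_codepoint(next_codepoint(pa, len_a, &ua));
        unsigned cb = fold_codepoint(next_codepoint(pb, len_b, &ub));
        if (ca != cb)
            return ca < cb ? -1 : 1;
        pa += ua; len_a -= ua;
        pb += ub; len_b -= ub;
    }
    /* One string is a prefix of the other: the shorter sorts first. */
    return (len_a > 0) - (len_b > 0);
}
```

With this sketch, "São Paulo" and "SAO PAULO" compare equal, and 
"Santos" sorts before "São Paulo" because the comparison only stops at 
the third letter (N versus O).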

>The LOG function shows a comparison between S and other first char 
>only only:
>
>41 65 A - 53 83 S = -18
>43 67 C - 53 83 S = -16
>46 70 F - 53 83 S = -13
>53 83 S - 53 83 S = 0
>49 73 I - 53 83 S = -10
>46 70 F - 53 83 S = -13
>50 80 P - 53 83 S = -3
>43 67 C - 53 83 S = -16
>47 71 G - 53 83 S = -12
>I was expecting it to go further in the comparison:"São Paulo" and 
>"Santos" should LOGS - Sã - ao - n -> stops here, not what your looking for
>When using it on ORDER BY, it is clear that only the first char is 
>compared.

I don't know what LOG / LOGS are in this context.

If you also need to search names with uncertain spelling, you can 
use my typos() function to perform a fuzzy search.  Here's a sample of 
its use on a decently populated ZipCodes table (848207 rows):

select pays, zip, ville, region from allcountries
where typos(ville, 'saopaul%') < 3
group by pays, ville, region

RecNo Pays Zip       Ville                    Region
----- ---- --------- ------------------------ --------------------------
     1 AR   6221      LA PAULINA               LA PAMPA
     2 AU   2031      St Pauls                 New South Wales
     3 BR   64670-000 São Julião               Piaui
     4 BR   01000-000 São Paulo                Sao Paulo
     5 BR   97980-000 São Paulo das Missões    Rio Grande do Sul
     6 BR   69600-000 São Paulo de Olivença    Amazonas
     7 BR   59460-000 São Paulo do Potengi     Rio Grande do Norte
     8 ES   22281     La Paul                  Aragon
     9 ES   22471     Laspaules                Aragon
    10 ES   07691     Sa Taulera               Baleares
    11 FR   29400     Lampaul Guimiliau        Bretagne
    12 FR   29810     Lampaul Plouarzel        Bretagne
    13 FR   29830     Lampaul Ploudalmezeau    Bretagne
    14 FR   33390     St Paul                  Aquitaine
    15 FR   61100     St Paul                  Basse-Normandie
    16 FR   87260     St Paul                  Limousin
    17 FR   88170     St Paul                  Lorraine
    18 FR   65150     St Paul                  Midi-Pyrenees
    19 FR   60650     St Paul                  Picardie
    20 FR   06570     St Paul                  Provence-Alpes-Cote D'Azur
    21 FR   73170     St Paul                  Rhone-Alpes
    22 FR   02300     St Paul Aux Bois         Picardie
    23 FR   81220     St Paul Cap De Joux      Midi-Pyrenees
    24 FR   82400     St Paul D Espis          Midi-Pyrenees
 >>> snip >>>
    68 FR   11320     St Paulet                Languedoc-Roussillon
    69 FR   30130     St Paulet De Caisson     Languedoc-Roussillon
    70 FR   43350     St Paulien               Auvergne
    71 GB   EC4       St Paul's                (null)
    72 GB   BR5       St Paul's Cray           (null)
    73 GB   SG4       St Paul's Walden         (null)
    74 HU   3714      Sajópálfala              Borsod-Abaúj-Zemplén
    75 IN   281307    Sahpau                   Uttar Pradesh
    76 IN   328027    Saipau                   Rajasthan
    77 IN   171006    Sanjauli                 Himachal Pradesh
    78 IT   39050     St.Paul                  Trentino-Alto Adige
    79 PK   47701     Sanpal                   Norhern Punajb Rawalpindi
    80 PT   8900-121  Sapal                    Faro
    81 PT   4560-042  Sopal                    Porto
    82 PT   2705-738  São Julião               Lisboa
    83 PT   7300-469  São Julião               Portalegre
    84 PT   4560-197  São Julião               Porto
    85 PT   4950-854  São Julião               Viana do Castelo
    86 PT   5400-754  São Julião de Montenegro Vila Real
    87 PT   5300-871  São Julião de Palácios   Bragança
    88 PT   2664-503  São Julião do Tojal      Lisboa
    89 PT   3230-023  São Paulo                Coimbra
    90 PT   4610-370  São Paulo                Porto
    91 PT   6160-130  São Paulo Baixo          Castelo Branco
    92 PT   6160-131  São Paulo Cima           Castelo Branco
    93 RE   97460     St Paul                  (null)

Result obtained by full table scan in 1.7s on a 3-year-old PC.
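
For readers unfamiliar with fuzzy matching, the general idea behind a 
"number of typos" measure can be sketched with the classic Levenshtein 
edit distance.  To be clear, this is an assumption made for 
illustration and not the typos() implementation: the function name is 
hypothetical, and it does no unaccenting, case folding or '%' wildcard 
handling.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Classic Levenshtein distance between two NUL-terminated byte
 * strings: the minimum number of single-byte insertions, deletions
 * and substitutions turning a into b.  Hypothetical stand-in, NOT
 * the typos() code.  Returns -1 if memory cannot be allocated. */
int edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t i, j;
    int result;
    /* row[j] holds the distance between a[0..i) and b[0..j). */
    int *row = malloc((lb + 1) * sizeof *row);

    if (row == NULL)
        return -1;
    for (j = 0; j <= lb; j++)
        row[j] = (int)j;

    for (i = 1; i <= la; i++) {
        int diag = row[0];              /* previous row, column j-1 */
        row[0] = (int)i;
        for (j = 1; j <= lb; j++) {
            int up = row[j];            /* previous row, column j */
            int best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            if (up + 1 < best)
                best = up + 1;          /* delete a[i-1] */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;  /* insert b[j-1] */
            row[j] = best;
            diag = up;
        }
    }
    result = row[lb];
    free(row);
    return result;
}
```

For instance, edit_distance("saopaul", "stpaul") is 2 (one 
substitution, one deletion).  A function of this kind could be exposed 
to SQL through sqlite3_create_function.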


--
j...@antichoc.net

