Thanks for your very detailed response Mark. Most helpful.
On Wed, Apr 25, 2012 at 10:14 AM, Mark Goodge <m...@good-stuff.co.uk> wrote: > On 24/04/2012 17:24, Tompkins Neil wrote: > >> How about if I want to only return postal codes that are like W1U 8JE >> not W13 0SU. >> >> Because in this example I have W1 as the postal code and W13 is the other >> postal code >> > > No, you don't. In this example you have W1U as one outbound code and W13 > as the other. > > W1U postcodes are not a subset of W1 postcodes, any more than IP27 > postcodes are a subset of IP2 postcodes. The fact that in W1U the district > segment is in the form of NA rather than NN doesn't change the fact that > it's an indivisible two-character code. > > So I think the first question has to be, why do you want to get W1 as a > particular substring from the postcode W1U 8JE? > > British postcodes have a structure which is easy for humans to understand, > although (unfortunately) rather hard to parse automatically. Essentially, > every full postcode contains four elements: > > Area code: one or two alpha characters, either A or AA > District code: one or two alphanumeric characters the first of which is > always numeric, either N, NN or NA > Sector code: single numeric character, always N > Walk code: two alpha characters, always AA > > It's customary, but not part of the formal specification, to insert > whitespace between the District and Sector codes. > > So, given the postcode WC1H 8EJ, we have: > > Area: WC > District: 1H > Sector: 8 > Walk: EJ > > Taken together, the first two sections form the "outbound" part of the > postcode, and the second two form the "inbound". (That is, the first two > identify the destination sorting depot that the originating depot will send > the post to, and the second two are used by the destination depot to make > the actual delivery). > > The reason for mentioning this is that postcodes, having a wide range of > possible formats, are not easy to handle with simple substring searches if > you're trying to extract outbound codes from a full postcode. It can be > done with regular expressions, but you have to be wary of assuming that the > space between District and Sector will always be present as, particularly > if you're getting data from user input, it might not be. > > In my own experience (which is quite extensive, as I've done a lot of work > with systems, such as online retail, which use postcodes as a key part of > the data), I've always found it simpler to pre-process the postcodes prior > to inserting them into the database in order to ensure they have a > consistent format (eg, inserting a space if none exists). That then makes > it easy to select an outbound code, as you can use the space as a boundary. > But if you want to be able to go further up the tree and select area codes > (eg, distinguishing between EC, WC and W) then it's harder, as you have to > account for the fact that some are two characters and some are only one. > You can do it with a regular expression, taking everything prior to the > first digit, but it's a lot easier in this case to extract the area code > prior to inserting the data into the database and store the area code in a > separate column. > > Mark > -- > Sent from my ZX Spectrum HD > http://mark.goodge.co.uk > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/mysql > >