Re: Validation - some things are more alphanumeric than others
My head has now grown back up to the ears. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "CakePHP" group. To post to this group, send email to cake-php@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/cake-php?hl=en -~--~~~~--~~--~--~---
Re: Validation - some things are more alphanumeric than others
Very true. I'll file a ticket today.

-Joel.

On Jun 2, 10:54 am, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
> On Mon, Jun 2, 2008 at 10:40 AM, leo <[EMAIL PROTECTED]> wrote:
>
> > I only meant, but didn't explain very well (at all) that it is no
> > longer good enough to assume that the whole world speaks American
> > English. While there is a good case for English-based programming
> > languages, programming is about end-users. They don't know and don't
> > want to know about limitations of 8 bits or ASCII's definition of the
> > alphabet, nor do clients or project managers. They just want to be
> > able to read things in their own language. We're getting there slowly.
>
> Programming is also about the people who write it as well, and I'd be
> willing to bet that this thread will be ignored if there is no ticket
> filed.
>
> --
> Chris Hartjes
> Internet Loudmouth
> Motto for 2008: "Moving from herding elephants to handling snakes..."
> @TheKeyBoard: http://www.littlehart.net/atthekeyboard
Re: Validation - some things are more alphanumeric than others
On Mon, Jun 2, 2008 at 10:40 AM, leo <[EMAIL PROTECTED]> wrote:
>
> I only meant, but didn't explain very well (at all) that it is no
> longer good enough to assume that the whole world speaks American
> English. While there is a good case for English-based programming
> languages, programming is about end-users. They don't know and don't
> want to know about limitations of 8 bits or ASCII's definition of the
> alphabet, nor do clients or project managers. They just want to be
> able to read things in their own language. We're getting there slowly.

Programming is also about the people who write it as well, and I'd be
willing to bet that this thread will be ignored if there is no ticket
filed.

--
Chris Hartjes
Internet Loudmouth
Motto for 2008: "Moving from herding elephants to handling snakes..."
@TheKeyBoard: http://www.littlehart.net/atthekeyboard
Re: Validation - some things are more alphanumeric than others
Thanks for your research!

I only meant, but didn't explain very well (at all), that it is no
longer good enough to assume that the whole world speaks American
English. While there is a good case for English-based programming
languages, programming is about end-users. They don't know and don't
want to know about limitations of 8 bits or ASCII's definition of the
alphabet, nor do clients or project managers. They just want to be
able to read things in their own language. We're getting there slowly.

I was actually surprised to find that English isn't the most spoken
language. But my intended point was that it is only one of three major
languages. I am British and I live and work in Spain, producing
multilingual websites in Catalan, Spanish, English and, occasionally,
German and French.

On 2 June, 16:13, Joel Perras <[EMAIL PROTECTED]> wrote:
> Ok, I'm a big enough man to admit when I'm wrong.
> After reading the POSIX specification for alphanumeric characters and
> realizing that it accommodates different character set collations,
> I concede that you are correct in stating that 'alphanumeric' should
> apply to accented characters.
>
> However, the fact that English is not the most spoken language in the
> world is a vacuous argument; by your reasoning, programming languages
> should be written and interpreted in Chinese and Spanish instead of
> English. I have yet to find a compiled or interpreted Chinese
> programming language, and I don't think they exist AFAIK.
> English is ubiquitous in the scientific (and software) community, and
> there's not much we can do to change this (my first language is
> French), nor would I want to.
>
> Anyways, here's what I've found:
>
> After doing a bit of research, I believe the solution to your problem
> is this regular expression: /[^[:alnum:]]/u
>
> Ex:
>
> //1st element will pass, 2nd ?, 3rd will fail
> $data = array("asdf1", "çñasd45", "@#%asd23");
>
> // The negated class matches any character that is NOT alphanumeric,
> // so a match means the string is invalid.
> foreach ($data as $str):
>     if (!preg_match("/[^[:alnum:]]/u", $str)) echo $str . " is valid alphanumeric. \n\n";
>     else echo $str . " IS NOT VALID\n\n";
> endforeach;
>
> print_r($data);
>
> //Output:
>
> asdf1 is valid alphanumeric.
>
> çñasd45 is valid alphanumeric.
>
> @#%asd23 IS NOT VALID
>
> Array
> (
>     [0] => asdf1
>     [1] => çñasd45
>     [2] => @#%asd23
> )
>
> On Jun 2, 4:56 am, leo <[EMAIL PROTECTED]> wrote:
>
> > I did click the link and I followed the references. It would seem that
> > a letter like ñ or ç is a letter embellished with a diacritic.
> > Therefore the letter is valid and the diacritic should be ignored. Ñ
> > is alpha.
>
> > From the same source, Spanish is spoken as a first language by between
> > 322 and 400 million people. English by 375 million. The English
> > speaking population of the USA is 215 million. Furthermore, Spanish is
> > a Latin language and the ASCII definition is based on the Latin
> > alphabet.
>
> > > http://en.wikipedia.org/wiki/Alphanumeric
>
> > Things have moved on from the World as defined by IBM et al
> > (thankfully), but unfortunately the World has now become defined, in
> > the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
> > conjunction with others, but dangerous on its own.
Re: Validation - some things are more alphanumeric than others
Ok, I'm a big enough man to admit when I'm wrong.
After reading the POSIX specification for alphanumeric characters and
realizing that it accommodates different character set collations,
I concede that you are correct in stating that 'alphanumeric' should
apply to accented characters.

However, the fact that English is not the most spoken language in the
world is a vacuous argument; by your reasoning, programming languages
should be written and interpreted in Chinese and Spanish instead of
English. I have yet to find a compiled or interpreted Chinese
programming language, and I don't think they exist AFAIK.
English is ubiquitous in the scientific (and software) community, and
there's not much we can do to change this (my first language is
French), nor would I want to.

Anyways, here's what I've found:

After doing a bit of research, I believe the solution to your problem
is this regular expression: /[^[:alnum:]]/u

Ex:

//1st element will pass, 2nd ?, 3rd will fail
$data = array("asdf1", "çñasd45", "@#%asd23");

// The negated class matches any character that is NOT alphanumeric,
// so a match means the string is invalid.
foreach ($data as $str):
    if (!preg_match("/[^[:alnum:]]/u", $str)) echo $str . " is valid alphanumeric. \n\n";
    else echo $str . " IS NOT VALID\n\n";
endforeach;

print_r($data);

//Output:

asdf1 is valid alphanumeric.

çñasd45 is valid alphanumeric.

@#%asd23 IS NOT VALID

Array
(
    [0] => asdf1
    [1] => çñasd45
    [2] => @#%asd23
)

On Jun 2, 4:56 am, leo <[EMAIL PROTECTED]> wrote:
> I did click the link and I followed the references. It would seem that
> a letter like ñ or ç is a letter embellished with a diacritic.
> Therefore the letter is valid and the diacritic should be ignored. Ñ
> is alpha.
>
> From the same source, Spanish is spoken as a first language by between
> 322 and 400 million people. English by 375 million. The English
> speaking population of the USA is 215 million. Furthermore, Spanish is
> a Latin language and the ASCII definition is based on the Latin
> alphabet.
>
> > http://en.wikipedia.org/wiki/Alphanumeric
>
> Things have moved on from the World as defined by IBM et al
> (thankfully), but unfortunately the World has now become defined, in
> the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
> conjunction with others, but dangerous on its own.
Re: Validation - some things are more alphanumeric than others
I did click the link and I followed the references. It would seem that
a letter like ñ or ç is a letter embellished with a diacritic.
Therefore the letter is valid and the diacritic should be ignored. Ñ
is alpha.

From the same source, Spanish is spoken as a first language by between
322 and 400 million people. English by 375 million. The English
speaking population of the USA is 215 million. Furthermore, Spanish is
a Latin language and the ASCII definition is based on the Latin
alphabet.

> http://en.wikipedia.org/wiki/Alphanumeric

Things have moved on from the World as defined by IBM et al
(thankfully), but unfortunately the World has now become defined, in
the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
conjunction with others, but dangerous on its own.
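One practical wrinkle with "the diacritic should be ignored": in UTF-8 a letter like ñ can arrive either as a single precomposed code point or as a plain n followed by a combining tilde. The sketch below assumes PHP 7 or later and UTF-8 input; the function name is made up for illustration and is not part of CakePHP. It accepts a letter or number optionally followed by combining marks.

```php
<?php
// Hypothetical helper for illustration only (not a CakePHP rule).
// Accepts letters (\p{L}) or numbers (\p{N}), each optionally followed
// by combining marks (\p{M}) such as a separate tilde or cedilla.
function isAlphaNumericWithMarks(string $value): bool
{
    // u = treat pattern and subject as UTF-8; \z = true end of string.
    return preg_match('/^(?:[\p{L}\p{N}]\p{M}*)+\z/u', $value) === 1;
}

$precomposed = "\u{00F1}";  // ñ as one code point
$decomposed  = "n\u{0303}"; // n followed by a combining tilde

var_dump(isAlphaNumericWithMarks($precomposed)); // bool(true)
var_dump(isAlphaNumericWithMarks($decomposed));  // bool(true)

// A class without \p{M} rejects the decomposed form:
var_dump(preg_match('/^[\p{L}\p{N}]+\z/u', $decomposed)); // int(0)
```

Normalizing input to a single form before validating would be another way to handle this; the pattern above simply tolerates both.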
Re: Validation - some things are more alphanumeric than others
I agree, and I do this myself in dual-language (English/French)
localised sites. However, I was simply attempting to illustrate to
the original poster that the 'alphanumeric' rule does exactly what it
is supposed to do, as per the definition of 'alphanumeric'.

The regex for multibyte alphanumeric characters is (I believe) simple
enough, but you need to ensure that the characters are UTF-8 encoded.
If you don't know the encoding, then I have no idea how you would be
able to perform the validation rule (and if this is something that
someone has done before, I will be forever grateful if you explain how
to do it), and hence my comment on the 'impossibility' of a universal
alphanumeric validation rule.

Cake is a wonderful tool, but sometimes you need to get your hands
dirty. If my regex-fu is still up to par, I believe /[\w\pL]/u to be
the correct multibyte regex to match alphanumeric characters. If this
is false, please correct me.

-Joel.

On May 30, 2:13 pm, Adriano Varoli Piazza <[EMAIL PROTECTED]> wrote:
> On May 30, 2:25 pm, Joel Perras <[EMAIL PROTECTED]> wrote:
>
> > http://en.wikipedia.org/wiki/Alphanumeric
>
> > In case you don't click the above link:
>
> > "Alphanumeric is a portmanteau of alphabetic and numeric and is
> > used to describe the collection of Latin letters and Arabic digits
> > used by much of western society. There are either 36 (single case) or
> > 62 (case-sensitive) alphanumeric characters. The alphanumeric
> > character set consists of the numbers 0 to 9 and letters A to Z."
>
> > I'd love for Cake to have a built-in validation rule that would
> > validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
> > that's pretty much impossible.
>
> Validating them all at the same time might be, but localizing a
> website to cater to different alphabets or languages is doable. When I
> have to validate against Spanish rules, I don't cower in fear of kanji
> or Arabic, I simply validate alphanumeric characters for my language,
> such as á, é, ñ, etc. After all, \w in Perl (or the equivalent regex
> entity for a word) is localized, IIRC.
>
> Saludos
> Adriano
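On the regex question, a minimal sketch may help, assuming PHP 7 or later. Two things are worth noting about /[\w\pL]/u as written: it is unanchored, so it succeeds as soon as any one character matches, and \w also accepts the underscore. The helper below is hypothetical (the name is invented here and is not a CakePHP function). It first rejects input that is not valid UTF-8, using the fact that preg_match() with the /u modifier returns false on malformed subjects, and then requires every character to be a letter or number. It does not solve the unknown-encoding problem; it only refuses anything that is not UTF-8.

```php
<?php
// Hypothetical helper for illustration only (not part of CakePHP).
function isUtf8AlphaNumeric(string $value): bool
{
    // An empty pattern with /u matches only if the subject is valid
    // UTF-8; on malformed input preg_match() returns false.
    if (preg_match('//u', $value) !== 1) {
        return false;
    }
    // Anchored: every character must be a letter (\p{L}) or number (\p{N}).
    return preg_match('/^[\p{L}\p{N}]+\z/u', $value) === 1;
}

// The unanchored pattern matches because "a" alone satisfies it:
var_dump(preg_match('/[\w\pL]/u', '@#%asd23')); // int(1)

var_dump(isUtf8AlphaNumeric('çñasd45'));    // bool(true)
var_dump(isUtf8AlphaNumeric('@#%asd23'));   // bool(false)
var_dump(isUtf8AlphaNumeric('snake_case')); // bool(false), underscore rejected
var_dump(isUtf8AlphaNumeric("\xF1"));       // bool(false), Latin-1 byte, not UTF-8
```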
Re: Validation - some things are more alphanumeric than others
On May 30, 2:25 pm, Joel Perras <[EMAIL PROTECTED]> wrote:
> http://en.wikipedia.org/wiki/Alphanumeric
>
> In case you don't click the above link:
>
> "Alphanumeric is a portmanteau of alphabetic and numeric and is
> used to describe the collection of Latin letters and Arabic digits
> used by much of western society. There are either 36 (single case) or
> 62 (case-sensitive) alphanumeric characters. The alphanumeric
> character set consists of the numbers 0 to 9 and letters A to Z."
>
> I'd love for Cake to have a built-in validation rule that would
> validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
> that's pretty much impossible.

Validating them all at the same time might be, but localizing a
website to cater to different alphabets or languages is doable. When I
have to validate against Spanish rules, I don't cower in fear of kanji
or Arabic, I simply validate alphanumeric characters for my language,
such as á, é, ñ, etc. After all, \w in Perl (or the equivalent regex
entity for a word) is localized, IIRC.

Saludos
Adriano
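The per-language approach described above could be sketched roughly as follows, assuming PHP 7 or later, UTF-8 input and a source file saved as UTF-8. The function name and the exact list of accented letters are assumptions made for illustration; adjust the character class to whatever your language actually requires.

```php
<?php
// Hypothetical per-language rule for illustration only: ASCII letters
// and digits plus the accented letters commonly used in Spanish.
function isSpanishAlphaNumeric(string $value): bool
{
    // i = case-insensitive (so Ñ and Á are accepted too),
    // u = treat pattern and subject as UTF-8, \z = true end of string.
    return preg_match('/^[a-z0-9áéíóúüñ]+\z/iu', $value) === 1;
}

var_dump(isSpanishAlphaNumeric('niño2008')); // bool(true)
var_dump(isSpanishAlphaNumeric('façade'));   // bool(false), ç is not in the list
var_dump(isSpanishAlphaNumeric('日本語'));    // bool(false)
```

The trade-off is that each language needs its own list, but the rule stays explicit about which characters it accepts.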
Re: Validation - some things are more alphanumeric than others
http://en.wikipedia.org/wiki/Alphanumeric

In case you don't click the above link:

"Alphanumeric is a portmanteau of alphabetic and numeric and is
used to describe the collection of Latin letters and Arabic digits
used by much of western society. There are either 36 (single case) or
62 (case-sensitive) alphanumeric characters. The alphanumeric
character set consists of the numbers 0 to 9 and letters A to Z."

I'd love for Cake to have a built-in validation rule that would
validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
that's pretty much impossible.

-Joel.

On May 30, 12:56 pm, leo <[EMAIL PROTECTED]> wrote:
> I will on Monday - it's seven pm and time to go shout at the kids.
>
> On 30 May, 18:55, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
>
> > On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> > > Letters like ñ & ç fail the following validation:
>
> > > 'rule' => 'alphaNumeric'
>
> > > I would say that's a bug, but my head still hasn't grown back from the
> > > last time I said that, so I won't.
>
> > That is probably because the regex that checks for alphaNumeric might
> > not be UTF-8 friendly. Don't panic, just file a ticket while your
> > head regrows.
>
> > --
> > Chris Hartjes
> > Internet Loudmouth
> > Motto for 2008: "Moving from herding elephants to handling snakes..."
> > @TheKeyBoard: http://www.littlehart.net/atthekeyboard
Re: Validation - some things are more alphanumeric than others
I will on Monday - it's seven pm and time to go shout at the kids.

On 30 May, 18:55, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
> On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> > Letters like ñ & ç fail the following validation:
>
> > 'rule' => 'alphaNumeric'
>
> > I would say that's a bug, but my head still hasn't grown back from the
> > last time I said that, so I won't.
>
> That is probably because the regex that checks for alphaNumeric might
> not be UTF-8 friendly. Don't panic, just file a ticket while your
> head regrows.
>
> --
> Chris Hartjes
> Internet Loudmouth
> Motto for 2008: "Moving from herding elephants to handling snakes..."
> @TheKeyBoard: http://www.littlehart.net/atthekeyboard
Re: Validation - some things are more alphanumeric than others
On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> Letters like ñ & ç fail the following validation:
>
> 'rule' => 'alphaNumeric'
>
> I would say that's a bug, but my head still hasn't grown back from the
> last time I said that, so I won't.

That is probably because the regex that checks for alphaNumeric might
not be UTF-8 friendly. Don't panic, just file a ticket while your
head regrows.

--
Chris Hartjes
Internet Loudmouth
Motto for 2008: "Moving from herding elephants to handling snakes..."
@TheKeyBoard: http://www.littlehart.net/atthekeyboard
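To illustrate what "not UTF-8 friendly" could mean in practice, here is a minimal sketch assuming PHP 7 or later. Both functions are hypothetical and written for this comparison only; neither is CakePHP's actual alphaNumeric implementation, which the thread does not show. The first uses an ASCII-only character class, the second uses Unicode properties with the /u modifier.

```php
<?php
// Hypothetical helpers for illustration; neither is CakePHP's real rule.

// ASCII-only check: accepts nothing beyond A-Z, a-z and 0-9.
function isAsciiAlphaNumeric(string $value): bool
{
    return preg_match('/^[a-zA-Z0-9]+\z/', $value) === 1;
}

// Unicode-aware check: any letter (\p{L}) or number (\p{N});
// the /u modifier makes PCRE read pattern and subject as UTF-8.
function isUnicodeAlphaNumeric(string $value): bool
{
    return preg_match('/^[\p{L}\p{N}]+\z/u', $value) === 1;
}

foreach (['asdf1', 'çñasd45', '@#%asd23'] as $value) {
    echo $value,
        ' ascii=', var_export(isAsciiAlphaNumeric($value), true),
        ' unicode=', var_export(isUnicodeAlphaNumeric($value), true),
        "\n";
}
```

Only the second function accepts "çñasd45"; both reject "@#%asd23".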
Validation - some things are more alphanumeric than others
Letters like ñ & ç fail the following validation:

'rule' => 'alphaNumeric'

I would say that's a bug, but my head still hasn't grown back from the
last time I said that, so I won't.