Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread leo

My head has now grown back up to the ears.
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"CakePHP" group.
To post to this group, send email to cake-php@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/cake-php?hl=en
-~--~~~~--~~--~--~---



Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread Joel Perras

Very true.  I'll file a ticket today.

-Joel.

On Jun 2, 10:54 am, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
> On Mon, Jun 2, 2008 at 10:40 AM, leo <[EMAIL PROTECTED]> wrote:
>
> > I only meant, but didn't explain very well (at all) that it is no
> > longer good enough to assume that the whole world speaks American
> > English. While there is a good case for English-based programming
> > languages, programming is about end-users. They don't know and don't
> > want to know about limitations of 8 bits or ASCII's definition of the
> > alphabet, nor do clients or project managers. They just want to be
> > able to read things in their own language. We're getting there slowly.
>
> Programming is also about the people who write it, and I'd be
> willing to bet that this thread will be ignored if there is no ticket
> filed.
>
> --
> Chris Hartjes
> Internet Loudmouth
> Motto for 2008: "Moving from herding elephants to handling snakes..."
> @TheKeyBoard:http://www.littlehart.net/atthekeyboard



Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread Chris Hartjes

On Mon, Jun 2, 2008 at 10:40 AM, leo <[EMAIL PROTECTED]> wrote:
>
> I only meant, but didn't explain very well (at all) that it is no
> longer good enough to assume that the whole world speaks American
> English. While there is a good case for English-based programming
> languages, programming is about end-users. They don't know and don't
> want to know about limitations of 8 bits or ASCII's definition of the
> alphabet, nor do clients or project managers. They just want to be
> able to read things in their own language. We're getting there slowly.

Programming is also about the people who write it, and I'd be
willing to bet that this thread will be ignored if there is no ticket
filed.

-- 
Chris Hartjes
Internet Loudmouth
Motto for 2008: "Moving from herding elephants to handling snakes..."
@TheKeyBoard: http://www.littlehart.net/atthekeyboard




Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread leo

Thanks for your research!

I only meant, but didn't explain very well (at all) that it is no
longer good enough to assume that the whole world speaks American
English. While there is a good case for English-based programming
languages, programming is about end-users. They don't know and don't
want to know about limitations of 8 bits or ASCII's definition of the
alphabet, nor do clients or project managers. They just want to be
able to read things in their own language. We're getting there slowly.

I was actually surprised to find that English isn't the most spoken
language. But my intended point was it is only one of three major
languages. I am British and I live and work in Spain, producing
multilingual websites in Catalan, Spanish, English and, occasionally,
German and French.

On 2 June, 16:13, Joel Perras <[EMAIL PROTECTED]> wrote:
> Ok, I'm a big enough man to admit when I'm wrong.
> After reading the POSIX specification for alphanumeric characters and
> realizing that it accommodates different character set collations,
> I concede that you are correct in stating that 'alphanumeric' should
> apply to accented characters.
>
> However, the fact that English is not the most spoken language in the
> world is a vacuous argument; by your reasoning, programming languages
> should be written and interpreted in Chinese and Spanish instead of
> English.  I have yet to find a compiled or interpreted Chinese
> programming language, and I don't think they exist AFAIK.
> English is ubiquitous in the scientific (and software) community, and
> there's not much we can do to change this (my first language is
> French), nor would I want to.
>
> Anyways, here's what I've found:
>
> After doing a bit of research, I believe the solution to your problem
> is this regular expression:  /[^[:alnum:]]/u
>
> Ex:
>
> //1st element will pass, 2nd ?, 3rd will fail
> $data = array("asdf1", "çñasd45", "@#%asd23");
>
> foreach ($data as $str):
>         if (!preg_match("/[^[:alnum:]]/u", $str)) echo $str . " is valid alphanumeric. \n\n";
>         else echo $str . " IS NOT VALID\n\n";
> endforeach;
>
> print_r($data);
>
> //Output:
>
> asdf1 is valid alphanumeric.
>
> çñasd45 is valid alphanumeric.
>
> @#%asd23 IS NOT VALID
>
> Array
> (
>     [0] => asdf1
>     [1] => çñasd45
>     [2] => @#%asd23
> )
>
> On Jun 2, 4:56 am, leo <[EMAIL PROTECTED]> wrote:
>
> > I did click the link and I followed the references. It would seem that
> > a letter like ñ or ç is a letter embellished with a diacritic.
> > Therefore the letter is valid and the diacritic should be ignored. Ñ
> > is alpha.
>
> > From the same source, Spanish is spoken as a first language by between
> > 322 and 400 million people. English by 375 million. The English
> > speaking population of the USA is 215 million. Furthermore, Spanish is
> > a Latin language and the ASCII definition is based on the Latin
> > alphabet.
>
> > > >http://en.wikipedia.org/wiki/Alphanumeric
>
> > Things have moved on from the World as defined by IBM et al
> > (thankfully), but unfortunately the World has now become defined, in
> > the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
> > conjunction with others, but dangerous on its own.



Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread Joel Perras

Ok, I'm a big enough man to admit when I'm wrong.
After reading the POSIX specification for alphanumeric characters and
realizing that it accommodates different character set collations,
I concede that you are correct in stating that 'alphanumeric' should
apply to accented characters.

However, the fact that English is not the most spoken language in the
world is a vacuous argument; by your reasoning, programming languages
should be written and interpreted in Chinese and Spanish instead of
English.  I have yet to find a compiled or interpreted Chinese
programming language, and I don't think they exist AFAIK.
English is ubiquitous in the scientific (and software) community, and
there's not much we can do to change this (my first language is
French), nor would I want to.


Anyways, here's what I've found:

After doing a bit of research, I believe the solution to your problem
is this regular expression:  /[^[:alnum:]]/u

Ex:

// 1st element will pass, 2nd ?, 3rd will fail
$data = array("asdf1", "çñasd45", "@#%asd23");

foreach ($data as $str):
    // No match for a non-alphanumeric character means the string is valid.
    if (!preg_match("/[^[:alnum:]]/u", $str)) echo $str . " is valid alphanumeric. \n\n";
    else echo $str . " IS NOT VALID\n\n";
endforeach;

print_r($data);

//Output:

asdf1 is valid alphanumeric.

çñasd45 is valid alphanumeric.

@#%asd23 IS NOT VALID

Array
(
[0] => asdf1
[1] => çñasd45
[2] => @#%asd23
)
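
The negated check above has two edge cases worth knowing about. The following is a minimal sketch, not a tested fix: both function names are hypothetical, and the string literals assume the file is saved as UTF-8. An empty string contains no non-alphanumeric character, so it passes the negated check. A malformed UTF-8 subject also passes, because preg_match() returns false for it and the leading "!" turns that into true. A positive, anchored match compared strictly against 1 rejects both.

```php
<?php
// The check from the message above, wrapped in a hypothetical helper.
function negated_check($str)
{
    return !preg_match("/[^[:alnum:]]/u", $str);
}

// Hypothetical stricter variant: one or more letters or decimal digits,
// from the start of the string to its very end (\z).
function positive_check($str)
{
    return preg_match('/^[\p{L}\p{Nd}]+\z/u', $str) === 1;
}

var_dump(negated_check(""));          // bool(true): an empty string has nothing to reject
var_dump(negated_check("\xF1"));      // bool(true): preg_match() returned false on bad UTF-8
var_dump(positive_check(""));         // bool(false)
var_dump(positive_check("\xF1"));     // bool(false)
var_dump(positive_check("çñasd45"));  // bool(true)
```

Whether an empty value should count as valid is a separate decision; the sketch only shows that the two forms disagree on it.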


On Jun 2, 4:56 am, leo <[EMAIL PROTECTED]> wrote:
> I did click the link and I followed the references. It would seem that
> a letter like ñ or ç is a letter embellished with a diacritic.
> Therefore the letter is valid and the diacritic should be ignored. Ñ
> is alpha.
>
> From the same source, Spanish is spoken as a first language by between
> 322 and 400 million people. English by 375 million. The English
> speaking population of the USA is 215 million. Furthermore, Spanish is
> a Latin language and the ASCII definition is based on the Latin
> alphabet.
>
> > >http://en.wikipedia.org/wiki/Alphanumeric
>
> Things have moved on from the World as defined by IBM et al
> (thankfully), but unfortunately the World has now become defined, in
> the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
> conjunction with others, but dangerous on its own.



Re: Validation - some things are more alphanumeric than others

2008-06-02 Thread leo

I did click the link and I followed the references. It would seem that
a letter like ñ or ç is a letter embellished with a diacritic.
Therefore the letter is valid and the diacritic should be ignored. Ñ
is alpha.

From the same source, Spanish is spoken as a first language by between
322 and 400 million people. English by 375 million. The English
speaking population of the USA is 215 million. Furthermore, Spanish is
a Latin language and the ASCII definition is based on the Latin
alphabet.

> >http://en.wikipedia.org/wiki/Alphanumeric

Things have moved on from the World as defined by IBM et al
(thankfully), but unfortunately the World has now become defined, in
the eyes of many, by Wikipedia. Wikipedia is a useful tool used in
conjunction with others, but dangerous on its own.



Re: Validation - some things are more alphanumeric than others

2008-05-30 Thread Joel Perras

I agree, and I do this myself in dual-language (English/French)
localised sites.

However, I was simply attempting to illustrate to the original poster
that the 'alphanumeric' rule does exactly what it is supposed to do, as
per the definition of 'alphanumeric'.  The regex for multibyte
alphanumeric characters is (I believe) simple enough, but you need to
ensure that the characters are UTF-8 encoded.  If you don't know the
encoding, then I have no idea how you would be able to perform the
validation rule (and if this is something that someone has done
before, I will be forever grateful if you explain how to do it), and
hence my comment on the 'impossibility' of a universal alphanumeric
validation rule.  Cake is a wonderful tool, but sometimes you need to
get your hands dirty.

If my regex-fu is still up to par, I believe /[\w\pL]/u to be the
correct multibyte regex to match alphanumeric characters.  If this is
false, please correct me.
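
As a hedged check of that pattern, and only a sketch: used without anchors, /[\w\pL]/u matches as soon as one word character appears anywhere in the string, and \w also accepts the underscore. The anchored alternative shown here, built from \p{L} and \p{Nd}, is an assumption about what "alphanumeric" should mean and not something taken from Cake. The literals assume a UTF-8 source file.

```php
<?php
// Unanchored: a single letter anywhere in the subject is enough to match.
var_dump(preg_match('/[\w\pL]/u', '@#%a'));                // int(1)

// Anchored, but \w still lets the underscore through.
var_dump(preg_match('/^[\w\pL]+\z/u', 'foo_bar'));         // int(1)

// Assumed stricter class: Unicode letters and decimal digits only.
var_dump(preg_match('/^[\p{L}\p{Nd}]+\z/u', 'foo_bar'));   // int(0)
var_dump(preg_match('/^[\p{L}\p{Nd}]+\z/u', 'çñasd45'));   // int(1)
```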

-Joel.

On May 30, 2:13 pm, Adriano Varoli Piazza <[EMAIL PROTECTED]> wrote:
> On May 30, 2:25 pm, Joel Perras <[EMAIL PROTECTED]> wrote:
>
> >http://en.wikipedia.org/wiki/Alphanumeric
>
> > In case you don't click the above link:
>
> > "Alphanumeric is a portmanteau of alphabetic and numeric and is
> > used to describe the collection of Latin letters and Arabic digits
> > used by much of western society. There are either 36 (single case) or
> > 62 (case-sensitive) alphanumeric characters. The alphanumeric
> > character set consists of the numbers 0 to 9 and letters A to Z."
>
> > I'd love for Cake to have a built-in validation rule that would
> > validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
> > that's pretty much impossible.
>
> Validating them all at the same time might be, but localizing a
> website to cater to different alphabets or languages is doable. When I
> have to validate against Spanish rules, I don't cower in fear of kanji
> or Arabic, I simply validate alphanumeric characters for my language,
> such as á, é, ñ, etc. After all, \w in Perl (or the equivalent regex
> entity for a word) is localized, IIRC.
>
> Saludos
> Adriano



Re: Validation - some things are more alphanumeric than others

2008-05-30 Thread Adriano Varoli Piazza

On May 30, 2:25 pm, Joel Perras <[EMAIL PROTECTED]> wrote:
> http://en.wikipedia.org/wiki/Alphanumeric
>
> In case you don't click the above link:
>
> "Alphanumeric is a portmanteau of alphabetic and numeric and is
> used to describe the collection of Latin letters and Arabic digits
> used by much of western society. There are either 36 (single case) or
> 62 (case-sensitive) alphanumeric characters. The alphanumeric
> character set consists of the numbers 0 to 9 and letters A to Z."
>
> I'd love for Cake to have a built-in validation rule that would
> validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
> that's pretty much impossible.

Validating them all at the same time might be, but localizing a
website to cater to different alphabets or languages is doable. When I
have to validate against Spanish rules, I don't cower in fear of kanji
or Arabic, I simply validate alphanumeric characters for my language,
such as á, é, ñ, etc. After all, \w in Perl (or the equivalent regex
entity for a word) is localized, IIRC.
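
The per-language approach could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name, the $extraLetters table, and the letter lists themselves, which are not exhaustive. The source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical table of extra letters per language, beyond a-z and 0-9.
$extraLetters = array(
    'es' => 'áéíóúüñ',
    'ca' => 'àçèéíïòóúü',
    'en' => '',
);

// Hypothetical helper: true when $str uses only the letters allowed for $locale.
function is_alnum_for_locale($str, $locale, $extraLetters)
{
    $extra = isset($extraLetters[$locale]) ? $extraLetters[$locale] : '';
    // /i makes the class case-insensitive, /u treats pattern and subject as UTF-8.
    $pattern = '/^[a-z0-9' . preg_quote($extra, '/') . ']+\z/iu';
    return preg_match($pattern, $str) === 1;
}

var_dump(is_alnum_for_locale('España', 'es', $extraLetters));  // bool(true)
var_dump(is_alnum_for_locale('España', 'en', $extraLetters));  // bool(false)
```

The trade-off is that each language needs its own list, which is the price of not accepting every script at once.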

Saludos
Adriano




Re: Validation - some things are more alphanumeric than others

2008-05-30 Thread Joel Perras

http://en.wikipedia.org/wiki/Alphanumeric

In case you don't click the above link:

"Alphanumeric is a portmanteau of alphabetic and numeric and is
used to describe the collection of Latin letters and Arabic digits
used by much of western society. There are either 36 (single case) or
62 (case-sensitive) alphanumeric characters. The alphanumeric
character set consists of the numbers 0 to 9 and letters A to Z."

I'd love for Cake to have a built-in validation rule that would
validate all Latin/Cyrillic/Asian/Arabic alphanumeric characters, but
that's pretty much impossible.

-Joel.

On May 30, 12:56 pm, leo <[EMAIL PROTECTED]> wrote:
> I will on Monday - it's seven pm and time to go shout at the kids.
>
> On 30 May, 18:55, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
>
> > On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> > > Letters like ñ & ç fail the following validation:
>
> > > 'rule' => 'alphaNumeric'
>
> > > I would say that's a bug, but my head still hasn't grown back from the
> > > last time I said that, so I won't.
>
> > That is probably because the regex that checks for alphaNumeric might
> > not be UTF-8 friendly.  Don't panic, just file a ticket while your
> > head regrows.
>
> > --
> > Chris Hartjes
> > Internet Loudmouth
> > Motto for 2008: "Moving from herding elephants to handling snakes..."
> > @TheKeyBoard:http://www.littlehart.net/atthekeyboard



Re: Validation - some things are more alphanumeric than others

2008-05-30 Thread leo

I will on Monday - it's seven pm and time to go shout at the kids.

On 30 May, 18:55, "Chris Hartjes" <[EMAIL PROTECTED]> wrote:
> On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> > Letters like ñ & ç fail the following validation:
>
> > 'rule' => 'alphaNumeric'
>
> > I would say that's a bug, but my head still hasn't grown back from the
> > last time I said that, so I won't.
>
> That is probably because the regex that checks for alphaNumeric might
> not be UTF-8 friendly.  Don't panic, just file a ticket while your
> head regrows.
>
> --
> Chris Hartjes
> Internet Loudmouth
> Motto for 2008: "Moving from herding elephants to handling snakes..."
> @TheKeyBoard:http://www.littlehart.net/atthekeyboard



Re: Validation - some things are more alphanumeric than others

2008-05-30 Thread Chris Hartjes

On Fri, May 30, 2008 at 12:53 PM, leo <[EMAIL PROTECTED]> wrote:
>
> Letters like ñ & ç fail the following validation:
>
> 'rule' => 'alphaNumeric'
>
> I would say that's a bug, but my head still hasn't grown back from the
> last time I said that, so I won't.

That is probably because the regex that checks for alphaNumeric might
not be UTF-8 friendly.  Don't panic, just file a ticket while your
head regrows.
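
What "not UTF-8 friendly" means in practice can be shown with a small sketch. The pattern below is an assumed ASCII-only stand-in, not the pattern CakePHP actually uses, and the literals assume the file is saved as UTF-8.

```php
<?php
// Assumed stand-in for a byte-oriented alphanumeric check.
$asciiOnly = '/^[a-z0-9]+\z/i';

var_dump(preg_match($asciiOnly, 'abc123'));  // int(1)
var_dump(preg_match($asciiOnly, 'ñ'));       // int(0): neither byte of "ñ" is in a-z or 0-9
var_dump(strlen('ñ'));                       // int(2): one character, two bytes in UTF-8
```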

-- 
Chris Hartjes
Internet Loudmouth
Motto for 2008: "Moving from herding elephants to handling snakes..."
@TheKeyBoard: http://www.littlehart.net/atthekeyboard




Validation - some things are more alphanumeric than others

2008-05-30 Thread leo

Letters like ñ & ç fail the following validation:

'rule' => 'alphaNumeric'

I would say that's a bug, but my head still hasn't grown back from the
last time I said that, so I won't.
