php-general Digest 22 Apr 2006 12:29:57 -0000 Issue 4086

Topics (messages 234540 through 234557):

Re: Passing Form As Argument
        234540 by: Nicolas Verhaeghe
        234546 by: Richard Lynch

Handling illegal byte sequences in UTF-8 strings
        234541 by: Matt Arnilo S. Baluyos (Mailing Lists)
        234544 by: Jochem Maas
        234545 by: Richard Lynch

Re: Creating an OO Shopping Cart
        234542 by: Steve

Re: unexpected T_NEW on object property
        234543 by: Jochem Maas

Re: Preg_match() regex
        234547 by: Kevin Waterson

cURL & cookies
        234548 by: Peter Hoskin

sorting troubles
        234549 by: William Stokes
        234550 by: Peter Hoskin
        234551 by: Peter Hoskin
        234553 by: Paul Novitski
        234555 by: Paul Novitski

unique array problem
        234552 by: suresh kumar
        234554 by: Paul Novitski

array problem
        234556 by: suresh kumar
        234557 by: Brian V Bonini

Administrivia:

To subscribe to the digest, e-mail:
        [EMAIL PROTECTED]

To unsubscribe from the digest, e-mail:
        [EMAIL PROTECTED]

To post to the list, e-mail:
        php-general@lists.php.net


----------------------------------------------------------------------
--- Begin Message ---


On Fri, April 21, 2006 4:56 pm, Nicolas Verhaeghe wrote:
> I have functions which dynamically generate client-side javascript
> validation functions according to the name of the field, its type 
> (text, password, email, drop down, radio button, textarea, and what 
> not).
>
> Same thing server-side.

Allow me to expand on why I think this is (generally) a wrong-headed
approach.

Consider a simple, common example:  The phone number.

Now, if you're doing this the Right Way and restricting only to the
characters known to be valid, then you want only: [0-9]

To be nice to users, maybe you allow '-' and space as well.

Of course, if it's taking international phone numbers, you want to let them
type that leading + sign, but not if it's US-only.

Now, if it's a business-oriented phone number, you want to allow something
like: 1-800-CALL-ATT because, by god, they paid big money to get the digits
they want and the right to promote/market that 800 number with
alpha-characters in it.

Yet, to be as restrictive as possible for non-business use with home
telephone numbers, you wouldn't want to let that slip by, so you can avoid
more pranksters.

If you look at it carefully, most of your data in most of your applications
*IS* that complicated.

Phone numbers?  See above.

Postal Codes?  US or World?  Zip +4 or not?  Should you not cross-check with
country code and a specific regex, for those countries where you KNOW what
it should be, and you expect many users?

Email address?  Man, you could spend a year trying to get this one right,
and still have it wrong.

So, all-in-all, the "rule" for how to sanitize data, IN MY OPINION, is too
application-specific and too domain-specific to be generalized and maintain
the level of security most programmers and clients would desire, given the
cost/benefit ratios involved for using a pre-packaged sanitizer, or a clear
in-line regex of what is kosher for THIS application and THIS domain.

To drive this home:  If the rule is complicated enough to want a generalized
function to handle it, it's probably complicated enough that you do NOT want
to over-generalize by using a package function, but want to use the RIGHT
regex for that application.
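
As a rough sketch of what "the RIGHT regex for that application" could look like for the phone number example: the two rules below are hypothetical, and the function names and exact patterns are assumptions made for illustration, not taken from any package.

```php
<?php
// Hypothetical rule for ONE application: US-only home phone numbers.
// Ten digits, with an optional space or dash between groups; no letters, no "+".
function is_us_home_phone($input)
{
    return preg_match('/^\d{3}[ -]?\d{3}[ -]?\d{4}$/D', $input) === 1;
}

// A business directory would want a different rule, for example allowing
// vanity letters after a 1-800 prefix.
function is_us_business_phone($input)
{
    return preg_match('/^1-800-[0-9A-Z]{3,4}-?[0-9A-Z]{3,4}$/D', $input) === 1;
}
```

With these two rules, '1-800-CALL-ATT' passes the business check and fails the home check, which is the kind of per-application difference a generic validator cannot know about.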

This is just my philosophical position, and I'm NOT the expert.

----------------------

You can always take it to the nth level and end up not verifying everything
but you can prevent most common mistakes.

As far as email address, make sure there is something that looks like
"[EMAIL PROTECTED]".

Same with Zip codes. You can CASS certify it all you want but you'll never
be sure that the address is correct until you send snail mail.

The idea of such client- and server-side verification is to prevent mistakes
that the user could make unwittingly, for instance mixing fields: typing
something other than the email address in that field, without realizing it.

You can force someone to enter data into a field that absolutely needs to be
filled.

So far, I have rarely seen people entering fake data into shopping carts or
online forms. Why? Because most people don't have time to waste screwing
around filling online forms with junk.

--- End Message ---
--- Begin Message ---
On Fri, April 21, 2006 7:09 pm, Nicolas Verhaeghe wrote:
> So far, I have rarely seen people entering fake data into shopping
> carts or
> online forms. Why? Because most people don't have time to waste
> screwing
> around filling online form with junk.

You have been very very very lucky, then.

Because there are a zillion bots out there making all kinds of crazy
POSTs to everybody's forms, trying to abuse our FORMs to:
  send junk email
  post links to on-line casinos and other regionally illegal e-ventures
  post links to pay-per-view and pay-per-click affiliate sites

Those CAPTCHA thingies (where you have to type the letters) are not
just for fun.

It's only a matter of time before CAPTCHA is useless.

Data validation and sanitization is not just to stop the Good Guys who
make typos, but also the Bad Guys who are attempting to abuse your
site.

-- 
Like Music?
http://l-i-e.com/artists.htm

--- End Message ---
--- Begin Message ---
Hello list,

We have recently upgraded our database to PostgreSQL 8.1.x which
handles UTF-8 more strictly than previous versions. The new version
will not allow illegal byte sequences when inserting data.

This has caused some errors in our system which inputs data.
Basically, what the system does is insert data which is copy-pasted
from OpenOffice.org files. The content of the OpenOffice.org files are
likewise pasted from various websites which may or may not be using
UTF-8 encoding.

After some research, I have looked at both iconv and mbstring (I might
use iconv since it's there by default). But nonetheless, someone on
the list may have a better way of handling this issue.

What then would be the best way to handle illegal byte sequences
before they are inserted into the database?
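
One possible approach, sketched under assumptions: check whether the string is already well-formed UTF-8, and only convert it if it is not. The function name is made up for this example, and treating the non-UTF-8 input as Windows-1252 is a guess about the pasted text that would need checking against real data.

```php
<?php
// Returns $input unchanged if it is already valid UTF-8; otherwise treats it
// as Windows-1252 (an ASSUMPTION about the pasted text) and converts it.
function to_valid_utf8($input, $assumedLegacy = 'Windows-1252')
{
    // With the /u modifier, PCRE refuses to run on invalid UTF-8, so a
    // successful empty match means the byte sequence is well-formed.
    if (preg_match('//u', $input) === 1) {
        return $input;
    }
    // iconv() returns false if the input cannot be converted.
    return iconv($assumedLegacy, 'UTF-8', $input);
}
```

A lone byte such as "\xE9" (e-acute in Latin-1 and Windows-1252) fails the check and is converted to the two-byte UTF-8 sequence "\xC3\xA9"; a string that is already valid UTF-8 passes through untouched.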


--
Stand before it and there is no beginning.
Follow it and there is no end.
Stay with the ancient Tao,
Move with the present.

--- End Message ---
--- Begin Message ---
Matt Arnilo S. Baluyos (Mailing Lists) wrote:
> Hello list,
>
> We have recently upgraded our database to PostgreSQL 8.1.x which
> handles UTF-8 more strictly than previous versions. The new version
> will not allow illegal byte sequences when inserting data.
>
> This has caused some errors in our system which inputs data.
> Basically, what the system does is insert data which is copy-pasted
> from OpenOffice.org files. The content of the OpenOffice.org files are
> likewise pasted from various websites which may or may not be using
> UTF-8 encoding.
>
> After some research, I have looked at both iconv and mbstring (I might
> use iconv since it's there by default). But nonetheless, someone on
> the list may have a better way of handling this issue.
>
> What then would be the best way to handle illegal byte sequences
> before they are inserted into the database?

the best? wait for php6. but that's probably not an option.
for the rest I'm a charset idiot (I just proved it with a nightmare
upgrade to mysql4.1.something)

> --
> Stand before it and there is no beginning.
> Follow it and there is no end.
> Stay with the ancient Tao,
> Move with the present.


--- End Message ---
--- Begin Message ---
On Fri, April 21, 2006 7:16 pm, Matt Arnilo S. Baluyos (Mailing Lists)
wrote:
> We have recently upgraded our database to PostgreSQL 8.1.x which
> handles UTF-8 more strictly than previous versions. The new version
> will not allow illegal byte sequences when inserting data.
>
> This has caused some errors in our system which inputs data.
> Basically, what the system does is insert data which is copy-pasted
> from OpenOffice.org files. The content of the OpenOffice.org files are
> likewise pasted from various websites which may or may not be using
> UTF-8 encoding.
>
> After some research, I have looked at both iconv and mbstring (I might
> use iconv since it's there by default). But nonetheless, someone on
> the list may have a better way of handling this issue.
>
> What then would be the best way to handle illegal byte sequences
> before they are inserted into the database?

I guess the big question would be this:

Where do you intend to output these strings?

Are they going to end up in UTF-8 HTML output?

Or are they going to end up in Unicode (UTF-16+) documents?

Or are you stuck with Latin-1 HTML output for legacy reasons?

Going at it from the other side...

A *LOT* of MS Office (Word) users will end up copying and pasting
stuff that just plain is NOT any kind of standard at all.

They're internal Word formatted characters that have no meaning
whatsoever in any world other than MS Word.

I suspect OpenOffice *might* be acting Word-compatible in this regard.

If you've got THOSE coming in, there are some functions in the User
Contributed Notes of str_replace that will let you convert funky crap
MS Word only characters into their closest moral equivalent HTML
Entity.

Or, if they ARE supposed to be valid UTF-8 characters, but there's a
bug in OpenOffice, well, obviously, you need a work-around TODAY, but
file a bug report too, so it can be fixed for tomorrow.

I doubt that anybody can really advise you without seeing the actual
characters (byte for byte) you are receiving.
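
For getting at the actual bytes, a small helper along these lines may be enough. The function name is hypothetical; bin2hex() and chunk_split() are standard PHP functions.

```php
<?php
// Show each byte of a string as two hex digits, separated by spaces,
// so a suspect sequence can be inspected and quoted exactly.
function hex_dump_bytes($input)
{
    return trim(chunk_split(bin2hex($input), 2, ' '));
}
```

For example, hex_dump_bytes("caf\xE9") gives "63 61 66 e9", which makes the stray e9 byte easy to spot.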

And you may want to compare what the user is seeing in OpenOffice with
what you are getting and what output you want -- Because until you've
defined what they "see", what they give you, and what you want, you're
pretty much just guessing in the dark what you want to do.

-- 
Like Music?
http://l-i-e.com/artists.htm

--- End Message ---
--- Begin Message ---
Richard... you're amazing. Good on you for just standing up there, stating your position and defending it like there's no tomorrow!

So everyone's aware, I have NO intention of storing credit card #'s. I don't see why anyone needs to.. especially after reading Richard's past posts in the archive.

Steve

--- End Message ---
--- Begin Message ---
M. Sokolewicz wrote:
> Jochem Maas wrote:
>> Paul Barry wrote:
>>> ..
>>>
>>> Then I have another class:
>>>
>>> <?php
>>> require_once('model/Address.class.php');
>>> class User {
>>>     public $name;
>>>     public $address = new Address();
>>
>> this is wrong. you can define the property in the class
>> with a constant or scalar value (i.e. literal string,
>> numeric value or an array) but not a return value of a
>> function or a 'new' object.
>
> just to nag, an array is not a scalar value. However, you're correct on this. Properties can only be defined in the class with constant values (this does not mean they have to be constants! The values they get just have to be fixed, and not determined during runtime.)

IC - spot the self taught idiot :-) (that's me btw)
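
For the archive, the usual way around the T_NEW error is to declare the property without a value and create the object in the constructor. This is a minimal sketch: the Address class here is an empty stand-in for the one in model/Address.class.php, which was not posted.

```php
<?php
// Stand-in for the real Address class (an assumption for this example).
class Address
{
    public $city = ''; // a constant default like this is allowed
}

class User
{
    public $name;
    public $address; // declared without a value

    public function __construct()
    {
        // Objects are created at runtime, so "new" belongs here.
        $this->address = new Address();
    }
}
```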

--- End Message ---
--- Begin Message ---
This one time, at band camp, "Jeff" <[EMAIL PROTECTED]> wrote:

> Hey all,
> 
> Regex pattern question here.  I need to match on "Foo-F00", "Foo-foo",
> "foo-Foo".  I know in perl you can use the /i to specify "case
> insensitive" matching.  Is there any such switch that can be used in
> preg_match() in PHP?

Yes, the same works in PCRE. so you can do
preg_match("/foo-foo/i", $string)
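
A slightly fuller sketch, with the pattern anchored so the whole string has to match. The candidate list is made up for the example; note that "Foo-F00" written with digit zeros, as in the question, would not match a pattern spelled with the letter "o".

```php
<?php
$candidates = array('Foo-Foo', 'Foo-foo', 'foo-Foo', 'Foo-F00');
$matched = array();
foreach ($candidates as $candidate) {
    // ^ and $ anchor the pattern; the trailing i makes it case-insensitive.
    if (preg_match('/^foo-foo$/i', $candidate)) {
        $matched[] = $candidate;
    }
}
print_r($matched);
```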

kevin


-- 
"Democracy is two wolves and a lamb voting on what to have for lunch. 
Liberty is a well-armed lamb contesting the vote."

--- End Message ---
--- Begin Message ---
Hi,

I'm trying to produce an sms sending script, however having problems
with curl and storing cookies. The login page works fine, however the
second http request returns a login page rather than authenticated
content. Additionally, in the headers a different cookie value for
JSESSIONID is set.

I'm running PHP/5.1.2 on FreeBSD

This script:
define('COOKIEJAR','/path/to/curl-cookiejar');
define('USERAGENT','Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');


$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIEJAR);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, USERAGENT);
curl_setopt($ch, CURLOPT_URL,
"https://www.domain.com/cocoon/nonav/secureLogin/login.xml");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, array2urlstring($logindata));
curl_setopt($ch, CURLOPT_NOBODY, 1);
$return['login'] = curl_exec($ch);
curl_close($ch);
unset($ch);

$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIEJAR);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, USERAGENT);
curl_setopt($ch, CURLOPT_NOBODY, 0);
curl_setopt($ch, CURLOPT_URL,
"https://www.domain.com/cocoon/websms/websms.xml");
$return['sms'] = curl_exec($ch);
curl_close($ch);
unset($ch);

$fp = fopen(COOKIEJAR,'w');
fclose($fp);

echo "<h1>Login</h1>\n";
echo "<pre>\n";
echo htmlspecialchars($return['login']);
echo "</pre>\n";

echo "<h1>SMS</h1>\n";
echo "<pre>\n";
echo $return['sms'];
echo "</pre>\n";

Produces the output:


  Login

HTTP/1.0 200 OK
Date: Sat, 22 Apr 2006 08:04:03 GMT
Server: Apache/1.3.33 (Unix) mod_jk/1.2.15 mod_perl/1.29 mod_ssl/2.8.22
OpenSSL/0.9.7e
Set-Cookie: JSESSIONID=FC9D098E63E5A065AC8934B7F7BB605A.zooapp02b;
Path=/cocoon; Secure
Set-Cookie: user=; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
X-Cocoon-Version: 2.1.5.1
Connection: close
Content-Type: text/html



  SMS

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Sat, 22 Apr 2006 08:04:03 GMT
Content-Type: text/html
Connection: close
Set-Cookie: AlteonP=c3b503cb52f50428; path=/
Server: Apache/1.3.33 (Unix) mod_jk/1.2.15 mod_perl/1.29 mod_ssl/2.8.22
OpenSSL/0.9.7e
Set-Cookie: JSESSIONID=56D23CCA63E2DDE37293FB64647027D8.zooapp01b;
Path=/cocoon
Set-Cookie: user=; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
X-Cocoon-Version: 2.1.5.1
Via: 1.1 nsw-cache1 (NetCache NetApp/6.0.2)
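
Two things that might be worth trying, offered as a guess rather than a diagnosis. First, setting CURLOPT_NOBODY turns the request into a HEAD request, so the login POST may never be sent as a POST. Second, both requests can share one cURL handle, so the session cookie stays in memory between them. The sketch below assumes a hypothetical function name and a hypothetical $loginFields array of form fields; http_build_query() stands in for the array2urlstring() helper, which was not posted.

```php
<?php
// Sketch only: log in and fetch a second page over ONE cURL handle.
// $loginFields is a hypothetical array of form fields for the login page.
function login_then_fetch($loginUrl, $pageUrl, array $loginFields, $cookieJar)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar);  // turn on cookie handling
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);   // write cookies out on close
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // Step 1: POST the login form. CURLOPT_NOBODY is left alone here,
    // because enabling it switches the request method to HEAD.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($loginFields));
    $login = curl_exec($ch);

    // Step 2: switch back to GET; the handle still holds the session cookie.
    curl_setopt($ch, CURLOPT_HTTPGET, 1);
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    $page = curl_exec($ch);

    curl_close($ch);
    return array('login' => $login, 'page' => $page);
}
```

It would be called with the two URLs from the script above and the same cookie jar path. Whether this fixes the changing JSESSIONID depends on the server, so treat it as a starting point.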

--- End Message ---
--- Begin Message ---
Hello,

Any idea how to sort this?

I have a column in DB that contains this kind of data, 
A20,B16,B17,C14,C15,D13,D12 etc.

I would like to print this data to a page and sort it ascending by the 
letter and descending by the number. Can this be done? Like

A20
B17
B16
C15
C14
D13
D12

Thanks
-Will 

--- End Message ---
--- Begin Message ---
See explode, http://www.php.net/explode

$var = 'A20,B16,B17,C14,C15,D13,D12';
$array = explode(',',$var);

foreach ($array as $key => $value) {
  echo $value ."\n";
}



--- End Message ---
--- Begin Message ---
hmm, should also see http://www.php.net/sort


--- End Message ---
--- Begin Message ---
At 02:49 AM 4/22/2006, William Stokes wrote:
> I have a column in DB that contains this kind of data,
> A20,B16,B17,C14,C15,D13,D12 etc.
>
> I would like to print this data to a page and sort it ascending by the
> letter and descending by the number. Can this be done? Like
>
> A20
> B17
> B16
> C15
> C14
> D13
> D12


Will, I can think of two ways to do this.

One way involves splitting each value into its two components so you can sort one ASC & the other DESC.

(First off, I'd seriously consider storing them in two separate DB fields. If you can do this, your problem is over because you can extract them in a query in just the order you want. If you're sorting them in opposite directions that's a pretty good clue that they're really separate data and should be stored separately.)

But you can split them on the fly in PHP. You can run preg_match() on each element of your array with a RegExp something like:

        /(\D)(\d+)/

(one non-numeric character followed by one or more numeric digits)

        preg_match("/(\D)(\d+)/", $aData[$i], $aMatches)

is going to produce an array like:

        $aMatches = array(
                [0] => A20
                [1] => A
                [2] => 20
        )

Build a workspace array using those three match values, sort the way you want on [1] & [2], and extract your result array from [0].
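
Here is one way that first approach could be sketched. It swaps the workspace array for a usort() comparison function, which is a different mechanism from the one described above but gives the same ordering. The function name is made up, and the sample data assumes every item is a single letter followed by digits.

```php
<?php
// Assumes every item is one letter followed by digits, e.g. "B17".
function compare_codes($a, $b)
{
    preg_match('/^(\D)(\d+)$/', $a, $ma);
    preg_match('/^(\D)(\d+)$/', $b, $mb);

    // Letter ascending first...
    $byLetter = strcmp($ma[1], $mb[1]);
    if ($byLetter !== 0) {
        return $byLetter;
    }
    // ...then number descending, compared as integers.
    return (int) $mb[2] - (int) $ma[2];
}

$aData = explode(',', 'A20,B16,B17,C14,C15,D13,D12');
usort($aData, 'compare_codes');
print_r($aData);
```

Because the numbers are compared as integers, this also copes with values of different widths, such as B9 and B10.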


An alternative way to do this, trickier but probably much faster, is to translate the initial letter into a value that's its inverse:

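        $sFrom = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        $sTo   = "ZYXWVUTSRQPONMLKJIHGFEDCBA";

        // strtr() swaps all letters in one pass; str_replace() with arrays works
        // pair by pair, so its final Z => A step would undo the first A => Z.
        foreach ($aData as $k => $v) {
                $aTemp[$k] = strtr($v, $sFrom, $sTo);
        }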

This will translate the letters in all the elements of the data array into their inverse:

        B16 --> Y16
        A20 --> Z20
        B17 --> Y17
        ...

Then just do a reverse sort on $aTemp and you get:

        Z20
        Y17
        Y16

Then translate the letters back to their original values and you get:

        A20
        B17
        C16

Voila.

Paul
--- End Message ---
--- Begin Message ---
At 03:43 AM 4/22/2006, I wrote:
> Then just do a reverse sort on $aTemp and you get:
>
>         Z20
>         Y17
>         Y16
>
> Then translate the letters back to their original values and you get:
>
>         A20
>         B17
>         C16


Oops, I made a typo: that final value should have been B16, not C16.

(All the same, not too shabby for four in the morning...)

Paul
--- End Message ---
--- Begin Message ---
I am facing one problem in my project.

This is my code:

  $a=array(0=>10,1=>10,2=>20,3=>30,4=>30,5=>40);
  $b=array();
  $b=array_unique($a);
  print_r($b);

The output I get from the above code is b[0]=10,b[2]=20,b[3]=30,b[5]=40;

but I want the output to be b[0]=10,b[1]=20,b[2]=30,b[3]=40;

I searched php.net but was not able to find any solution. I have been breaking
my head over this for the last 5 hours. I am waiting for a reply from anyone.
   
   
   
   

                                

--- End Message ---
--- Begin Message ---
At 03:21 AM 4/22/2006, suresh kumar wrote:
> I am facing one problem in my project.
>
> This is my code:
>
>   $a=array(0=>10,1=>10,2=>20,3=>30,4=>30,5=>40);
>   $b=array();
>   $b=array_unique($a);
>   print_r($b);
>
> The output I get from the above code is b[0]=10,b[2]=20,b[3]=30,b[5]=40;
>
> but I want the output to be b[0]=10,b[1]=20,b[2]=30,b[3]=40;
>
> I searched php.net but was not able to find any solution. I have been breaking
> my head over this for the last 5 hours. I am waiting for a reply from anyone.


Suresh,

After you use array_unique() to remove duplicate values, can't you simply use sort() to impose consecutive keys?

http://php.net/sort :

"This function sorts an array. Elements will be arranged from lowest to highest when this function has completed.

"Note: This function assigns new keys for the elements in array. It will remove any existing keys you may have assigned, rather than just reordering the keys."

Paul
--- End Message ---
--- Begin Message ---
Sorry, earlier I mistyped some values.

I am facing one problem in my project.

This is my code:

  $a=array(0=>10,1=>10,2=>40,3=>30,4=>30,5=>10);
  $b=array();
  $b=array_unique($a);
  print_r($b);

The output I get from the above code is b[0]=10,b[2]=40,b[3]=30,b[5]=10;

but I want the output to be b[0]=10,b[1]=40,b[2]=30,b[3]=10;

I searched php.net but was not able to find any solution. I have been breaking
my head over this for the last 5 hours. I am waiting for a reply from anyone.
   


                                

--- End Message ---
--- Begin Message ---
On Sat, 2006-04-22 at 07:09, suresh kumar wrote:
> Sorry, earlier I mistyped some values.
>
> I am facing one problem in my project.
>
> This is my code:
>
>   $a=array(0=>10,1=>10,2=>40,3=>30,4=>30,5=>10);
>   $b=array();
>   $b=array_unique($a);
>   print_r($b);
>
> The output I get from the above code is b[0]=10,b[2]=40,b[3]=30,b[5]=10;
>
> but I want the output to be b[0]=10,b[1]=40,b[2]=30,b[3]=10;


That will return:

Array
(
    [0] => 10
    [2] => 40
    [3] => 30
)

If you want:

Array
(
    [0] => 10
    [1] => 40
    [2] => 30
)


Writing $a without explicit keys, as in
$a=array(10,10,40,30,30,10);
makes no difference, because array_unique() preserves the original keys
either way.

Instead, iterate through $a to re-sequence the index in $b.

$a=array(0=>10,1=>10,2=>40,3=>30,4=>30,5=>10);
$a=array_unique($a);

foreach($a as $v) {
        $b[] = $v;
}

print_r($b);
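
A shorter variant of the same idea, sketched with the values from the question: array_values() renumbers the keys from 0 while keeping the order, so the loop is not needed.

```php
<?php
$a = array(0 => 10, 1 => 10, 2 => 40, 3 => 30, 4 => 30, 5 => 10);

// array_unique() keeps the first occurrence of each value and its key;
// array_values() then renumbers the keys from 0 without reordering.
$b = array_values(array_unique($a));

print_r($b); // keys 0, 1, 2 hold 10, 40, 30
```

Unlike sort(), this leaves 40 ahead of 30, which matters when the original order should be kept.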

-- 

s/:-[(/]/:-)/g


Brian        GnuPG -> KeyID: 0x04A4F0DC | Key Server: pgp.mit.edu
======================================================================
gpg --keyserver pgp.mit.edu --recv-keys 04A4F0DC
Key Info: http://gfx-design.com/keys
Linux Registered User #339825 at http://counter.li.org

--- End Message ---
