Re: [PHP] Performance question for table updating

2007-11-27 Thread Andrew Ballard
On Nov 24, 2007 2:32 AM, Jon Westcot <[EMAIL PROTECTED]> wrote:
> Hi all:
>
> For those who've been following the saga, I'm working on an application 
> that needs to load
> a data file consisting of approximately 29,000 to 35,000 records in it (and 
> not short ones,
> either) into several tables.  I'm using MySQL as the database.
>
> I've noticed a really horrible performance difference between INSERTing 
> rows into the
> table and UPDATEing rows with new data when they already exist in the table.  
> For
> example, when I first start with an empty table, the application inserts 
> around 29,600
> records in something less than 6 minutes.  But, when I use the second file, 
> updating that
> same table takes over 90 minutes.
>
> Here's my question: I had assumed -- probably wrongly -- that it would be 
> far more
> expedient to only update rows where data had actually changed; moreover, that 
> I should
> only update the changed fields in the particular rows.  This involves a large 
> number of if
> statements, i.e.,
>
> if ($old_row["field_a"] !== $new_row["field_66"]) {
>     $update_query .= "field_a = '" .
>         mysql_real_escape_string($new_row["field_66"]) . "',";
> }
>
> Eventually, I wind up with a query similar to:
>
> UPDATE table_01 SET field_a = 'New value here', updated=CURDATE() 
> WHERE
> primary_key=12345
>
> I thought that, to keep the table updating to a minimum, this approach 
> made the most
> sense.  However, seeing the two hugely different performance times has made 
> me question
> whether or not it would be faster to simply update every field in the table 
> and eliminate all of
> these test conditions.
>

I don't think indexes are really your trouble. Based on what you've
posted, it looks like you are selecting the entire data set from MySQL
back to PHP and then looping through each record to compare the new
and old values to build your queries. If that's the case, it will
almost assuredly be faster to just loop through the new data and
update everything (even if nothing has changed). Otherwise, you have
the added network traffic to send the entire result set back from
MySQL to PHP plus the per-row, per-column comparisons, string
concatenation operations, and finally another round trip across the
network to MySQL again to execute the query for each row in the data.
If you want to keep some of your optimizations, you can still loop
through the data in your file and build statements like this:

UPDATE table_01 SET
    field_a = 'New value for a',
    field_b = 'New value for b',
    updated = CURDATE()
WHERE primary_key = 12345
  AND NOT (
    field_a = 'New value for a'
    AND field_b = 'New value for b'
  )

This way, if nothing in the record has changed, you still make a trip
to MySQL, but MySQL has very little work to do: it finds the record by
the primary key, sees that no values have changed, and updates nothing.
No indexes need to be updated, and even your `updated` field is left
alone.
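
For illustration, a minimal sketch of building such a statement in PHP
(column names follow Jon's example; the second column and the key field
are hypothetical):

<?php
// One unconditional UPDATE per row, guarded so MySQL writes nothing
// (and touches no indexes) when the values are already current.
function build_update($new_row)
{
    $a  = mysql_real_escape_string($new_row["field_66"]);
    $b  = mysql_real_escape_string($new_row["field_67"]);  // hypothetical column
    $id = (int)$new_row["key"];                             // hypothetical key field

    return "UPDATE table_01 SET
                field_a = '$a',
                field_b = '$b',
                updated = CURDATE()
            WHERE primary_key = $id
              AND NOT (field_a = '$a' AND field_b = '$b')";
}
?>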

Andrew

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question for table updating

2007-11-25 Thread Chris



Eventually, I wind up with a query similar to:

UPDATE table_01 SET field_a = 'New value here', updated=CURDATE() WHERE 
primary_key=12345


Even though you've solved it, one way to work out a problem like this is
to change the statement into a SELECT query (unfortunately MySQL can't
EXPLAIN insert/update/delete statements, which would make it a lot easier).


ie change it to

select 1 from table where primary_key=12345;

If that's slow, then you're missing an index on primary_key.

If that's fast... err, you're back to square one :/ Or maybe you have too
many indexes (which means the update process is spending all of its
time updating the fields & indexes).
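
That check is quick to run from PHP -- a sketch, assuming the old
mysql_* API and Jon's table name:

<?php
// In EXPLAIN output, type 'const' means the primary-key index is used;
// 'ALL' means a full table scan (index missing or unusable).
$res = mysql_query("EXPLAIN SELECT 1 FROM table_01 WHERE primary_key = 12345");
$row = mysql_fetch_assoc($res);
echo $row['type'], "\n";
?>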


--
Postgresql & php tutorials
http://www.designmagick.com/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question for table updating (SOLVED)

2007-11-24 Thread Robert Cummings
On Sat, 2007-11-24 at 04:03 -0700, Jon Westcot wrote:
> 
> Moral of the story?  Two, really.  First, ensure you always reference
> values in the way most appropriate for their type.  Second, don't make your
> idiocy public by asking stupid questions on a public forum.   What's the
> quote (probably attributed to Churchill)?  "It is better to be ignorant and
> silent than to voice one's opinions and remove all doubt." ;)

I don't think that phrase applies to your situation... I'd say you did
the right thing. Far too often people are afraid of reprisal or seeming
stupid and so a problem doesn't get solved until it's far too late and
its effects become magnified. Here's a more apt quote for your
situation:

"It's better to be an idiot for 10 minutes than a failure for
 the rest of your life."

-- Me :)

Cheers,
Rob.
-- 
...
SwarmBuy.com - http://www.swarmbuy.com

Leveraging the buying power of the masses!
...

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question for table updating (SOLVED)

2007-11-24 Thread Bastien Koert

Could there be some performance gain from uploading the data to another
table and then doing the update/insert via SQL?
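
For instance, something along these lines (a sketch only: it assumes
MySQL 4.1+ for INSERT ... ON DUPLICATE KEY UPDATE, and the file and
table names are made up):

<?php
// Bulk-load the raw file into an empty staging table, then upsert
// into the real table in one server-side statement.
mysql_query("LOAD DATA LOCAL INFILE 'newdata.txt' INTO TABLE staging");
mysql_query("
    INSERT INTO table_01 (primary_key, field_a, updated)
    SELECT primary_key, field_a, CURDATE() FROM staging
    ON DUPLICATE KEY UPDATE
        field_a = VALUES(field_a),
        updated = VALUES(updated)");
?>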

bastien


> From: [EMAIL PROTECTED]
> To: php-general@lists.php.net
> Date: Sat, 24 Nov 2007 04:03:53 -0700
> Subject: Re: [PHP] Performance question for table updating (SOLVED)
> 
> :: snip -- Jon's reply and Andrés' message appear in full below ::

Re: [PHP] Performance question for table updating (SOLVED)

2007-11-24 Thread Jon Westcot
Hi Rob, et al.:

- Original Message -
From: "Andrés Robinet" <[EMAIL PROTECTED]>
> > -Original Message-
> > From: Jon Westcot [mailto:[EMAIL PROTECTED]
> >
> > :: gigantic snip here::
> >
> > So, long story short (oops -- too late!), what's the consensus
> > among the learned assembly here?  Is it faster to just UPDATE the
> > record if it already exists regardless of the fact that maybe only one
> > or two out of 75 or more fields changed versus testing each one of
> > those 75 fields to try and figure out which ones actually changed and
> > then only update those?
> >
> > I look forward to reading all of your thoughts.
> >
> > Sincerely,
> >
> > Jon
>
> :: snip -- Andrés' full message appears below ::

Thanks for the comments and suggestions.  Prior to receiving your note,
I went back and did a bit of checking on my code.  Turns out that the
problem was "hardware" related -- the infamous "Loose Screw Behind the
Keyboard."

The problem actually boiled down to two quotation marks -- they were
present in the search code to see if a record with the specified key
existed, but were omitted in the WHERE clause of the UPDATE statement.  Said
update therefore refused to use the nice little index I'd provided for its
use and instead scanned through the entire table to find the record in
question.
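
For anyone else bitten by this, the effect is easy to reproduce (a
sketch; it assumes primary_key is a CHAR/VARCHAR column, as in Jon's
case):

<?php
// Fast: string literal against a string column -- the index is used.
mysql_query("UPDATE table_01 SET field_a = 'x' WHERE primary_key = '12345'");

// Slow: a bare number against a string column forces MySQL to cast the
// key on every row, so the index is skipped and the whole table scanned.
mysql_query("UPDATE table_01 SET field_a = 'x' WHERE primary_key = 12345");
?>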

Moral of the story?  Two, really.  First, ensure you always reference
values in the way most appropriate for their type.  Second, don't make your
idiocy public by asking stupid questions on a public forum.   What's the
quote (probably attributed to Churchill)?  "It is better to be ignorant and
silent than to voice one's opinions and remove all doubt." ;)

Jon

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question for table updating

2007-11-24 Thread Andrés Robinet
> -Original Message-
> From: Jon Westcot [mailto:[EMAIL PROTECTED]
> Sent: Saturday, November 24, 2007 4:32 AM
> To: PHP General
> Subject: [PHP] Performance question for table updating
> 
> Hi all:
> 
> For those who've been following the saga, I'm working on an
> application that needs to load a data file consisting of approximately
> 29,000 to 35,000 records in it (and not short ones, either) into
> several tables.  I'm using MySQL as the database.
> 
> I've noticed a really horrible performance difference between
> INSERTing rows into the table and UPDATEing rows with new data when
> they already exist in the table.  For example, when I first start with
> an empty table, the application inserts around 29,600 records in
> something less than 6 minutes.  But, when I use the second file,
> updating that same table takes over 90 minutes.
> 
> Here's my question: I had assumed -- probably wrongly -- that it
> would be far more expedient to only update rows where data had actually
> changed; moreover, that I should only update the changed fields in the
> particular rows.  This involves a large number of if statements, i.e.,
> 
> if ($old_row["field_a"] !== $new_row["field_66"]) {
>     $update_query .= "field_a = '" .
>         mysql_real_escape_string($new_row["field_66"]) . "',";
> }
> 
> Eventually, I wind up with a query similar to:
> 
> UPDATE table_01 SET field_a = 'New value here',
> updated=CURDATE() WHERE primary_key=12345
> 
> I thought that, to keep the table updating to a minimum, this
> approach made the most sense.  However, seeing the two hugely different
> performance times has made me question whether or not it would be
> faster to simply update every field in the table and eliminate all of
> these test conditions.
> 
> And, before someone comments that indexes on the table can cause
> performance hits, I DROP nearly all of the indexes at the start of the
> processing, only keeping those indexes necessary to do the original
> INSERT or the subsequent UPDATE, and then add all of the extra
> "steroid" indexes (you know -- the performance-enhancing ones )
> after all of the INSERTs and UPDATEs have been finished.
> 
> So, long story short (oops -- too late!), what's the consensus
> among the learned assembly here?  Is it faster to just UPDATE the
> record if it already exists regardless of the fact that maybe only one
> or two out of 75 or more fields changed versus testing each one of
> those 75 fields to try and figure out which ones actually changed and
> then only update those?
> 
> I look forward to reading all of your thoughts.
> 
> Sincerely,
> 
> Jon

I don't know about consensus over here because I'm kind of newgie (stands
for new geek, as opposed to newbie which stands for new ball breaker :D :D
). I don't know of your previous messages but I can tell you one story...
Some time ago I got involved in a project that required geo-distance
calculation (you know distance between two points with latitude and
longitude). Basically I had to take a set of points and calculate the
distance of each of those points to a given (reference) one. The math was
something like the "square root of the sum of a constant times the square
sin of..." well, I can't remember it, but the point is, it was a complicated
formula, which I thought would allow for some optimizations in PHP.
Accustomed to regular (compiled) programming languages I developed a set of
routines to optimize the task and went ahead and queried the database for
the (say, 1000 records) dataset of points. Then applied the math to the
points and the reference point and got the result... in about 5 minutes to
my (disgusted) surprise.
Then I grabbed the MySQL manual, built a "non-optimized" version of the
formula to put directly in the SQL query and get the "shortest distance"
(which was my goal in the end) calculated by MySQL right away. I thought
"ok, I'll prepare a cup of coffee to wait for MySQL to finish the
calculation". To my surprise the query returned the expected result in less
than 2 seconds.
My logic was (wrongly) the following: PHP is a programming language, SQL is
a data access language; I'll get the data using MySQL and do the math using
PHP. But I forgot PHP is an interpreted language, that a number is more than
a number to PHP, but a zval object behind the scenes. I forgot
about the memory and the time required to build those objects when one
retrieves data out of a database server. I forgot about parsing time, and
"support logic and safety checks" in the language that overkill any attempt
to build TDCPL (Too Damn Complex Programming Logic) in PHP.
So, now, when I have to do some logic stuff to the retrieved data, I first
check "how much" I can push into the query itself, to get little or nothing
of programming logic in PHP after retrieving (before storing) the data.
All that said, I'd give a shot to the MySQL REPLACE function (I wouldn't
even branch the code to use INSERT or UPDATE depending on the record already
existing or not; if you have a primary key, all you need is REPLACE). But,
PLEASE LOOK AT THE GOTCHAS (like set col_name=col_name+1). Furthermore, if
those data files were to be uploaded by me (I mean, me, the coder, not the
end user), I'd build (use) a program to convert them to SQL sentences on my
desktop PC, where I can use faster programming languages and can wait through
five minutes of heavy processing (instead of overloading the server for five
minutes, which would slow down every other service on there).
In the end it depends on your requirements, where you get the data from,
and if and how you want to automate the task (I didn't get your previous
messages, I got subscribed recently; if you can send me a link to those
ones... great!)

Rob
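
To make the REPLACE gotcha concrete (a sketch, not Andrés' code; the
column names are invented):

<?php
// REPLACE is a DELETE followed by an INSERT, not an in-place update:
// any column left out of the statement is reset to its default, and
// expressions like hits = hits + 1 operate on the freshly inserted row.
mysql_query("REPLACE INTO table_01 (primary_key, field_a)
             VALUES (12345, 'New value here')");
// every other column of row 12345 now holds its default value
?>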

Re: [PHP] Performance question

2006-02-01 Thread Chris

Hi,

Any time you fetch results from a database they take up memory; you
can't do much about that (you can limit the effect by using LIMIT in
conjunction with paging and only getting the columns you need, etc, but
that's about it).



If you're using a standard id/parentid type approach you're stuck with 
recursive sql calls.



If you want to take a different approach you could use a tree/node system.

You end up with stuff like this:

 categoryid |   node   | categoryname
------------+----------+--------------
          1 | 0001     | xxx
          2 | 00010002 | 111

(categoryid '2' is a subcategory of categoryid '1').

Then you can quickly find the path of a category using just sql.
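
For example, the full path of category 2 comes back in one query -- a
sketch against the table above (the table name 'category' is assumed):

<?php
// Every ancestor's node string is a prefix of the child's node string,
// so a single self-join returns the whole path, root first.
$res = mysql_query("
    SELECT parent.categoryname
    FROM category AS child
    JOIN category AS parent ON child.node LIKE CONCAT(parent.node, '%')
    WHERE child.categoryid = 2
    ORDER BY parent.node");
while ($row = mysql_fetch_assoc($res)) {
    echo $row['categoryname'], "\n";   // prints 'xxx', then '111'
}
?>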

The downside is moving categories/parents around is a pain.

I haven't tried this myself, but to support multiple parents you could
have a separate table just for the joins, with the 'node' (read: path)
in it; each row basically holds the full path to the end category (which
you join with your 'category' table to get the name etc).


If you need more info let me know.

Mathieu Dumoulin wrote:

:: snip -- the full exchange between Mathieu and Miles appears below ::


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2006-02-01 Thread Mathieu Dumoulin

Miles Thompson wrote:

:: snip -- Miles' message is quoted in full below ::



I'll try to explain my post better since people don't seem to understand
it. I'm trying to illustrate a node tree client side; everything is
considered a node in a tree. Now these nodes can appear more than once,
since I allow a node to be linked to different elements (not just one
parent -- I may need to refer to a node several times).

Considering that a node may be there twice or more, and that the number
of nodes to work with may be quite high, I don't want to start copying
everything from the database that I might need into a PHP memory
structure... Nor do I want to recursively call SQL queries getting the
children of every parent. (That would cost a lot on the database server.)

So I was wondering what would be the best way to actually implement
this... "dilemma" of keeping it fast and not using too much memory.

As I understand your reply, Miles, the query I'm doing resides in PHP
memory even if I don't fetch the data? If you do confirm that, I WILL
have to resort to recursive SQL calls. :(

To finish my post, I'll re-explain rapidly my vision of doing this,
though obviously I may be mistaken -- maybe I can't do this. Taking two
SQL queries, one for the nodes and one for the links between nodes, I'd
first read the results sorted by id and build a mapping table for each
result. When I need some info, I'd look its id up in the mapping table
and retrieve the row from the result residing in MySQL memory. (Up to
now this seemed logical, until Miles told me that results are kept in
PHP memory.) If this is still a valid way of doing it, great. What would
be the impact on the SQL server? Would this be a good way? I'd like to
know from other pros how they'd implement it.
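
A minimal sketch of that mapping-table idea (old mysql_* API; table and
column names are made up -- and note the caveat Miles raised: a buffered
result set does live in the PHP process):

<?php
$res = mysql_query("SELECT id, parent_id, name FROM node ORDER BY id");

// One pass to map id => row offset; mysql_data_seek() can then jump
// straight to any row without keeping copies in a PHP array.
$offset = array();
for ($i = 0, $n = mysql_num_rows($res); $i < $n; $i++) {
    $row = mysql_fetch_row($res);
    $offset[$row[0]] = $i;
}

function fetch_node($res, $offsets, $id)
{
    if (!isset($offsets[$id])) {
        return false;
    }
    mysql_data_seek($res, $offsets[$id]);
    return mysql_fetch_assoc($res);
}

$node = fetch_node($res, $offset, 42);   // fetch node 42 on demand
?>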


Mathieu Dumoulin

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2006-02-01 Thread Miles Thompson

At 12:02 PM 2/1/2006, Mathieu Dumoulin wrote:


This is more a "How would you do it" than a "How can I do it" question.

Didn't have time to try it, but I want to know how mysql_data_seek acts
with large result sets.

For example, I'm thinking of building a node tree application that can
have dual direction links to nodes attached to different places.

I was wondering if I could actually have two result sets that query
everything sorted by ID (links and nodes) and then just seek the rows I
need instead of dumping everything into PHP memory. When I say large I
mean really large -- more than the 2 MB of data allowed by most PHP
servers.

That's where the "how you'd do it" comes into play. I think I'd just
query my tables and loop them, but keep only the line (to do a data_seek
later on) and ID in some kind of a hash table, or simply an array. This
would make it relatively fast without taking too much memory.

This is my solution; how would you people do it without dumping
everything to memory or making recursive SQL calls (which will obviously
slow everything down, I'm pretty sure)?


Mathieu Dumoulin
Programmer analyst in web solutions
[EMAIL PROTECTED]


I don't get it - what are you trying to do? Why? What is being visualized 
in the tree structure?


Building trees is expensive, you're essentially duplicating the index of 
the database, and if you fetch the data it's in memory regardless of how 
you present it.


The mysql_data_seek needs the row number - but by  then you're on the 
client side, if this is to respond to a link.


Maybe I just don't see it - or maybe you're confusing server side and 
client side.


Regards - Miles Thompson 




--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2006-02-01 Thread tedd

:: snip -- Mathieu's question, quoted in full above ::


Mathieu:

I'm not sure what you're asking, but if it is to limit the amount of 
data presented to a user from a search, you could use LIMIT.


The below uses LIMIT:

http://xn--ovg.com/mysql

If you want the code, just ask.
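
In outline, such a page looks something like this (a sketch, not the
code behind that link; the table and column names are invented):

<?php
$page     = isset($_GET['page']) ? max(1, (int)$_GET['page']) : 1;
$per_page = 50;
$offset   = ($page - 1) * $per_page;

// LIMIT offset, count caps what each request pulls into PHP.
$res = mysql_query(sprintf(
    "SELECT id, name FROM node ORDER BY id LIMIT %d, %d",
    $offset, $per_page));
while ($row = mysql_fetch_assoc($res)) {
    echo $row['id'], ': ', $row['name'], "\n";
}
?>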

tedd
--

http://sperling.com/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2006-02-01 Thread Mathieu Dumoulin
B-trees or binary trees or hash tables or whatever sort algorithm or
memory organisation would be just great if I put all my data in the page
and tried or needed to sort it, but I can't do that and don't really
need to.

I'm actually searching for a way to load a ton of data from MySQL while
avoiding transferring everything into PHP memory to work it out.

I think the B-tree would be nice if I had a ton of data to sort, but I
technically don't have lots of elements in my node tree on the same
levels; I could just sort that data in a simple PHP sort before showing
it :)


Jochem Maas wrote:

I think this might interest you:

http://www.bluerwhite.org/btree/

then again it may make your ears bleed (because of the Maths :-).

Mathieu Dumoulin wrote:

:: snip -- the original question, quoted in full above ::



--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2006-02-01 Thread Jochem Maas

I think this might interest you:

http://www.bluerwhite.org/btree/

then again it may make your ears bleed (because of the Maths :-).

Mathieu Dumoulin wrote:

:: snip -- the original question, quoted in full above ::



--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2003-06-29 Thread Justin French
on 25/06/03 12:19 PM, Ow Mun Heng ([EMAIL PROTECTED]) wrote:

> Can someone help explain how I can perform a benchmark on the queries or
> whatever?

Write some code, run it many many times, time it with something like Example
1 on http://au.php.net/microtime, then write alternate code, run it many
times, and compare the results.

One will be faster than the other, and you can now make an informed decision
about which is the smarter way to go.  If the difference is tiny, then you
could ask yourself which method is easier for you to code and maintain...

My bet is sessions will win on both counts (performance and coding
simplicity), but learning how to benchmark your scripts is a vital
step...
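
In PHP 4 terms (no microtime(true) yet), the timing idiom looks roughly
like this:

<?php
function getmicrotime()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float)$usec + (float)$sec;
}

$start = getmicrotime();
for ($i = 0; $i < 10000; $i++) {
    // code under test goes here -- e.g. the query, or the session read
}
echo 'Elapsed: ', getmicrotime() - $start, " seconds\n";
?>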


Justin French


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question

2003-06-24 Thread Jeff Harris
On Jun 25, 2003, "Ow Mun Heng" claimed that:

|Can someone help explain how I can perform a benchmark on the queries or
|whatever?
|
|
|Cheers,
|Mun Heng, Ow
|H/M Engineering
|Western Digital M'sia
|DID : 03-7870 5168
|
|
Do it many times and time it.

--
Registered Linux user #304026.
"lynx -source http://jharris.rallycentral.us/jharris.asc | gpg --import"
Key fingerprint = 52FC 20BD 025A 8C13 5FC6  68C6 9CF9 46C2 B089 0FED
Responses to this message should conform to RFC 1855.

<?php
require_once 'Benchmark/Timer.php';   // assuming PEAR's Benchmark_Timer,
                                      // which provides start()/stop()/timeElapsed()
$timer = new Benchmark_Timer();

print("Starting benchmark...\n");
ob_flush();
$timer->start();
$SQLQuery = "SELECT * FROM table";
for ($i = 1; $i <= 10; $i++) {
    $insert = $dblink->query($SQLQuery);   // $dblink: a DB wrapper with a query() method
}
$timer->stop();
print("Total execution time: ");
print($timer->timeElapsed());
print(" seconds.\n");
?>


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question

2003-06-24 Thread Ow Mun Heng
Can someone help explain how I can perform a benchmark on the queries or
whatever?


Cheers,
Mun Heng, Ow
H/M Engineering
Western Digital M'sia 
DID : 03-7870 5168


-Original Message-
From: Boaz Yahav [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 24, 2003 1:41 AM
To: Hardik Doshi; [EMAIL PROTECTED]
Subject: RE: [PHP] Performance question


:: snip -- Boaz's reply and Hardik's original question appear in full below ::


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question

2003-06-23 Thread Boaz Yahav
It depends on your HW / application / number of visitors.
If you are planning to have many visitors and expect this number to
constantly grow, then you should avoid going to the DB as much as
possible.

Sincerely

berber

Visit http://www.weberdev.com/ Today!!!
To see where PHP might take you tomorrow.


-Original Message-
From: Hardik Doshi [mailto:[EMAIL PROTECTED] 
Sent: Monday, June 23, 2003 5:10 PM
To: [EMAIL PROTECTED]
Subject: [PHP] Performance question


Hi Group,

I have a question regarding retrieving information. I have
functionality in which, on every user click, the system needs to
retrieve information for a particular user and display the page
according to the retrieved information. Now the question is: which is
the scalable solution? (1) Retrieve information from the database on
each user click. (2) Retrieve information from the session (here the
information is retrieved once, when the user logs into the system, and
stored in the session file on the server).

Please let me know

Thanks

Hardik


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Performance question

2003-06-23 Thread CPT John W. Holmes
> I have a question regarding retrieving information. I have
> functionality in which, on every user click, the system needs to
> retrieve information for a particular user and display the page
> according to the retrieved information. Now the question is: which is
> the scalable solution? (1) Retrieve information from the database on
> each user click. (2) Retrieve information from the session (here the
> information is retrieved once, when the user logs into the system, and
> stored in the session file on the server).

Benchmark it and find out yourself.

I'm sure 99% of the people here are going to jump on this and say SESSION is
the way to go, and I'd probably agree with them up front. But the only way
to really tell with your program is to benchmark each possibility. If you're
already establishing a database connection for other uses, then one more
query may not be a big deal versus opening a file and reading it, but
you'll never know the real answer until you try both of them. Now, if you
only use the database connection to get this information, then yeah,
sessions will definitely be faster, since you remove the database connection
entirely.
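
The session variant John describes could be as simple as this sketch
(the query and column names are invented):

<?php
session_start();

// Hit the database once per login; later clicks read the session only.
if (!isset($_SESSION['user_info'])) {
    $res = mysql_query(sprintf(
        "SELECT name, prefs FROM users WHERE user_id = %d",
        (int)$_SESSION['user_id']));
    $_SESSION['user_info'] = mysql_fetch_assoc($res);
}
$info = $_SESSION['user_info'];
?>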

---John Holmes...


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance Question

2003-03-04 Thread Niklas Lampén
I think that file_get_contents() is quicker, since include() runs what it
reads through the PHP parser as normal PHP code. And that gives you the
answer to the other question as well. :)
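
For static markup there is a third option worth benchmarking alongside
those two -- a minimal sketch:

<?php
// readfile() copies the file straight to output: it is neither parsed
// as PHP (include) nor built up as a string in memory (file_get_contents).
readfile('myfile.html');
?>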


Niklas


-Original Message-
From: Patrick Teague [mailto:[EMAIL PROTECTED] 
Sent: 4. maaliskuuta 2003 10:57
To: [EMAIL PROTECTED]
Subject: [PHP] Performance Question


What's the difference in performance between

print( file_get_contents("myfile.html") );

and

include("myfile.html");

is there any particular reason for using one over the other?

Patrick



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Performance question

2002-04-16 Thread James E. Hicks III

I think you've got the best set up already. I have a PDF library that I do a
similar thing with. To update the site I just dump the new PDF's into their
directory and the users reload the page to see the new content.

James

-Original Message-
From: Fifield, Mike [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 16, 2002 11:22 AM
To: [EMAIL PROTECTED]
Subject: [PHP] Performance question


Ok, I have a performance question. I run a wallpaper site; this is the URL
to the main gallery page: http://www.cognitivedistortion.com/php/gal.php
The page does a dir read and lists all pictures found in each directory,
and it does this every time the page is loaded. Now, at first this seemed
like an incredibly inefficient way to generate pages. The alternatives
would be to generate a list of the pictures and save it in a text file, or
to create a list of the pictures and store it in a database. If I use a
text file I will have to read the text file every time the page is
generated, which seems like it would take just as long as reading the file
allocation table, which I am doing now. As for the database, to get the
same thing done I would first have to log into the database and then pull
the information.
I was hoping to get some input as to which would be the fastest/best way
to use.
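
A middle ground would be to keep the directory read but cache its
result, regenerating only when the directory changes -- a sketch with
invented paths, using PHP 4-era functions:

<?php
$dir   = '/path/to/gallery';
$cache = '/tmp/gallery.lst';

// On most filesystems filemtime() of a directory changes when entries
// are added or removed, so the scan reruns only after an upload.
if (!file_exists($cache) || filemtime($cache) < filemtime($dir)) {
    $pics = array();
    $dh = opendir($dir);
    while (($f = readdir($dh)) !== false) {
        if (preg_match('/\.(jpe?g|gif|png)$/i', $f)) {
            $pics[] = $f;
        }
    }
    closedir($dh);
    $fp = fopen($cache, 'w');
    fwrite($fp, implode("\n", $pics));
    fclose($fp);
}
$pictures = file($cache);   // one entry per picture
?>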


Mike Fifield
Charles Schwab & Co, Inc.
WARNING: All e-mail sent to or from this address will be received by the
Charles Schwab corporate e-mail system and is subject to archival and review
by someone other than the recipient.



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP] Performance question

2001-10-18 Thread Rasmus Lerdorf

> Hi!
>
> I have a page that uses server-side includes to display different
> features. Something like this:
>
>    <!--#include virtual="..." -->  (a few db calls)
>    <!--#include virtual="..." -->  (perl-script with db calls)
>    <!--#include virtual="..." -->  (even more db calls)
>
>
> If everything is embedded in a php-page I assume that the parser would
> start only once, and that the db-connections would be open until the
> whole page closes?
>
> If the assumption above is correct, does the same apply to a .shtml-page?
> Or does the parser execute eache php-script one at the time, each time
> invoking the parser?

include virtual invokes an Apache sub-request.  It is basically like
sending another GET request back to Apache.  It goes through all the same
layers and the php or cgi script that is run sees it as a completely
separate request.  It has no idea it is part of a single .shtml page.  So
no, there can't be any sharing of resources across multiple include
virtuals from a .shtml page.

I would suggest doing the bulk in PHP and if you need a Perl cgi run, use
PHP's virtual() function which does the same as include virtual in a
.shtml page.
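
In a PHP page, that might look like this (a sketch; the CGI path is
made up):

<?php
// One PHP request, one shared database connection for the PHP parts...
$db = mysql_connect('localhost', 'user', 'pass');
// ...page sections and their db calls go here...

// ...and an Apache sub-request only where the Perl CGI is unavoidable,
// equivalent to include virtual in a .shtml page.
virtual('/cgi-bin/feature.pl');
?>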

-Rasmus


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] Performance question

2001-02-02 Thread Steve Werby

"Matthew Mundy" <[EMAIL PROTECTED]> wrote:
> I was wondering.  What kind of performance reduction is there in including
> files or using the auto prepended one for a file less than, say, 10 line?
> Specifically, I would think that the file IO would be a detriment to such
> a small file.  Without the includes, code would be repeated, but in PHP,
> IMHO, speed is a much more important factor than code size.  Anyways, what
> would be the performance factors in deciding what goes in an include vs
> inline code?

I don't think you'll find a noticable difference in speed.  To me the
benefit of a modular design outweighs the effect on speed.  As a rule of
thumb, I take any code that will be used in multiple files and store in a
separate file, but development speed is my most critical constraint.
Reduced file size is a nice side effect, but storage space is so cheap that
it really doesn't matter. And if you want to increase the speed from the
user's perspective, you may want to look at a combination of caching and
sending gzipped data to browsers that support it. There was a tutorial on
phpbuilder.com recently about this strategy, and with the exception of a
few obstacles it works nicely, depending on what your output is like. You
might want to benchmark including the code inline versus using include() or
auto-prepending it by writing a program to simulate expected (or heavy)
usage of the script in question, and using something similar to the code
below to track the time the script took to run. On small, simple programs
the results will be skewed by the time it takes to run the timing code
itself, but on longer, more complex scripts that will be negligible when
comparing your test cases.

<?php
// start of script
list($usec, $sec) = explode(' ', microtime());
$time_start = (float)$usec + (float)$sec;

// ... body of the script being tested ...

list($usec, $sec) = explode(' ', microtime());
error_log('runtime: ' . ((float)$usec + (float)$sec - $time_start) . ' seconds');
?>
--
Steve Werby
COO
24-7 Computer Services, LLC
Tel: 804.817.2470
http://www.247computing.com/


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] Performance question

2001-01-11 Thread Chris Adams


""Chris Lee"" <[EMAIL PROTECTED]> wrote in message
93l3qd$p98$[EMAIL PROTECTED]">news:93l3qd$p98$[EMAIL PROTECTED]...
> has anyone ran any test to see what kind of performance hit? I know people
> are allways ranting and raving about single or double quotes regarding
> performance and people ranting and raving about HTML in or out of echo
> statements.

I tend to agree with you on this, but I will note that the timing problems
seem to become more of an issue with long strings. Recently I ran a
moderately complex function in a benchmark where doing the obsessive things
like switching from double-quotes to single quotes gave a double-digit
percentage speed-up. However, to put things in perspective, modifying the
code to eliminate data copies and streamline if()s resulted in something
like a 900% speedup.

By that time, the point was made - the Cold Fusion it was being compared to
took 125 times longer to run and wasn't as featureful...

Chris



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] Performance question

2001-01-11 Thread Chris Lee

Has anyone run any tests to see what kind of performance hit there is? I
know people are always ranting and raving about single versus double quotes
regarding performance, and people ranting and raving about HTML in or out
of echo statements.

The performance hit just isn't amazing; use what you want, I figure -- use
what's easier to read. IMHO the last example is the hardest to read.

Please, everyone, post your own results, whether you want to back me or
slash me.



[lee@server lee]$ time php -q test_1.php > /dev/null
23.82user 0.07system 0:25.76elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (448major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.19user 0.09system 0:24.03elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.27user 0.14system 0:26.84elapsed 87%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

//---



[lee@server lee]$ time php -q test_1.php > /dev/null
23.24user 0.01system 0:23.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (448major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.17user 0.10system 0:23.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.17user 0.08system 0:23.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

//---

[lee@server lee]$ time php -q test_1.php > /dev/null
23.26user 0.05system 0:23.58elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (757major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.21user 0.06system 0:23.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

[lee@server lee]$ time php -q test_1.php > /dev/null
23.27user 0.02system 0:23.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (447major+2606minor)pagefaults 0swaps

Chris Lee
Mediawaveonline.com





"Ignacio Vazquez-Abrams" <[EMAIL PROTECTED]> wrote in message
[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> On Thu, 11 Jan 2001, John Guynn wrote:
>
> > Do I pay a performance penality for calling a file that only contains
html
> > code .php or .php3?  In otherwords what is the php parser overhead if
there
> > is no php code in the file?
> >
> > John Guynn
> >
> > This email brought to you by RFCs 821 and 1225.
> >
>
> Yes, because that file will be put through the parser regardless of
whether or
> not there's any PHP in it.
>
> --
> Ignacio Vazquez-Abrams  <[EMAIL PROTECTED]>
>
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> To contact the list administrators, e-mail: [EMAIL PROTECTED]
>



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] Performance question

2001-01-11 Thread Steve Edberg

At 2:22 PM -0500 1/11/01, Ignacio Vazquez-Abrams wrote:
>On Thu, 11 Jan 2001, John Guynn wrote:
>
> > Do I pay a performance penality for calling a file that only contains html
> > code .php or .php3?  In otherwords what is the php parser overhead if there
> > is no php code in the file?
> >
> > John Guynn
> >
> > This email brought to you by RFCs 821 and 1225.
> >
>
>Yes, because that file will be put through the parser regardless of whether or
>not there's any PHP in it.


The extent of this overhead depends on whether you're running PHP as
a module or as a CGI. With CGI, the overhead is quite large, probably
too much unless you have a really meaty server or a lightly-hit web
site. For an Apache module, I recall a test someone ran more than a
year ago that resulted in an extra ~4% load for parsing plain HTML
pages through PHP (as opposed to just statically serving the HTML
pages). This person had set his server to php-parse the .html extension.

- steve


>--
>Ignacio Vazquez-Abrams  <[EMAIL PROTECTED]>
>
 

+--- "They've got a cherry pie there, that'll kill ya" --+
| Steve Edberg   University of California, Davis |
| [EMAIL PROTECTED]   Computer Consultant |
| http://aesric.ucdavis.edu/  http://pgfsun.ucdavis.edu/ |
+-- FBI Special Agent Dale Cooper ---+

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]




Re: [PHP] Performance question

2001-01-11 Thread Ignacio Vazquez-Abrams

On Thu, 11 Jan 2001, John Guynn wrote:

> Do I pay a performance penality for calling a file that only contains html
> code .php or .php3?  In otherwords what is the php parser overhead if there
> is no php code in the file?
>
> John Guynn
>
> This email brought to you by RFCs 821 and 1225.
>

Yes, because that file will be put through the parser regardless of whether or
not there's any PHP in it.

-- 
Ignacio Vazquez-Abrams  <[EMAIL PROTECTED]>


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]