On 3 Sep 2013, at 02:30, Daevid Vincent <dae...@daevid.com> wrote:

> I'm confused on how a reference works I think.
> 
> I have a DB result set in an array I'm looping over. All I simply want to do
> is make the array key the "id" of the result set row.
> 
> This is the basic gist of it:
> 
>       private function _normalize_result_set()
>       {
>              foreach($this->tmp_results as $k => $v)
>              {
>                     $id = $v['id'];
>                     $new_tmp_results[$id] =& $v; //2013-08-29 [dv] using a reference here cuts the memory usage in half!

You are storing a reference to $v, not a copy of the row. Every pass through 
the loop binds another element of $new_tmp_results to that same variable, so 
when foreach assigns the next row to $v, every element stored so far changes 
with it. With this code I'd expect $new_tmp_results to be an array where the 
keys (i.e. the IDs) are correct, but the data in each item matches the data in 
the last item from the original array, which appears to be what you describe.
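
To see the effect in isolation, here is a minimal sketch of the same pattern. 
The $rows data is made up for illustration; only the =& assignment inside the 
foreach mirrors your code.

```php
<?php
// Made-up sample rows standing in for $this->tmp_results.
$rows = array(
    array('id' => 10, 'name' => 'alpha'),
    array('id' => 20, 'name' => 'beta'),
    array('id' => 30, 'name' => 'gamma'),
);

$by_id = array();
foreach ($rows as $k => $v) {
    // Binds this element to the variable $v itself, not to the row's data.
    $by_id[$v['id']] =& $v;
}
// Break the binding so later writes to $v don't touch the array.
unset($v);

// The keys are right, but every element holds the last row.
foreach ($by_id as $id => $row) {
    echo $id, ' => ', $row['name'], "\n";
}
```

Replacing =& with a plain = gives each key its own row.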

>                     unset($this->tmp_results[$k]);

Doing this on every iteration is likely to be very inefficient. I don't know 
how PHP handles this internally, but I wouldn't be surprised if it's 
allocating a new chunk of memory for a version of the array without this 
element. You may find it better not to unset anything until the loop has 
finished, at which point you can just unset($this->tmp_results).

> 
>                     /*
>                     if ($i++ % 1000 == 0)
>                     {
>                           gc_enable(); // Enable Garbage Collector
>                           var_dump(gc_enabled()); // true
>                           var_dump(gc_collect_cycles()); // # of elements cleaned up
>                           gc_disable(); // Disable Garbage Collector
>                     }
>                     */
>              }
>              $this->tmp_results = $new_tmp_results;
>              //var_dump($this->tmp_results); exit;
>              unset($new_tmp_results);
>       }


Try this:

private function _normalize_result_set()
{
  // Initialise the temporary variable.
  $new_tmp_results = array();

  // Loop around just the keys in the array.
  foreach (array_keys($this->tmp_results) as $k)
  {
    // Store the item in the temporary array with the ID as the key.
    // Note no pointless variable for the ID, and no use of &!
    $new_tmp_results[$this->tmp_results[$k]['id']] = $this->tmp_results[$k];
  }

  // Assign the temporary variable to the original variable.
  $this->tmp_results = $new_tmp_results;
}

I'd appreciate it if you could plug this in and see what your memory usage 
reports say. In most cases, trying to control garbage collection through the 
use of references is the worst way to go about optimising your code. In my 
code above I'm relying on PHP's copy-on-write behaviour: assigning an array 
doesn't duplicate its data until one of the copies is modified. There are no 
unsets; $new_tmp_results simply goes out of scope when the method returns, 
which makes it eligible for clean-up.
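
If you want to watch copy-on-write at work, something along these lines 
should do it. This is only a rough sketch: the array size is arbitrary, and 
the exact byte counts reported by memory_get_usage() will vary between PHP 
versions, so treat the numbers as relative rather than absolute.

```php
<?php
// An arbitrary large array to make the difference visible.
$original = range(1, 100000);

$before = memory_get_usage();

// Plain assignment: both variables share the same underlying data.
$copy = $original;
$after_assign = memory_get_usage();

// The first write forces PHP to give $copy its own storage.
$copy[0] = -1;
$after_write = memory_get_usage();

$assign_cost = $after_assign - $before;
$write_cost  = $after_write - $after_assign;

printf("assignment cost: %d bytes\n", $assign_cost);
printf("first write cost: %d bytes\n", $write_cost);
```

The assignment should cost next to nothing, while the first write costs 
roughly the size of the array. $original is left untouched.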

Where is this result set coming from? You'd save yourself a lot of memory and 
time by putting the data into this format when you read it from the source. 
For example, if reading it from MySQL, do $this->tmp_results[$row['id']] = $row 
when looping around the result set.
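
As a sketch of what I mean, the following keys each row by its ID as it 
arrives. fetch_rows() is a hypothetical stand-in for your real data source so 
that the example runs on its own (it uses a generator, so it needs PHP 5.5 or 
later); with mysqli you would loop with 
while ($row = $result->fetch_assoc()) instead.

```php
<?php
// Hypothetical stand-in for a database cursor: yields one row at a time.
function fetch_rows()
{
    yield array('id' => 7, 'name' => 'alpha');
    yield array('id' => 3, 'name' => 'beta');
    yield array('id' => 9, 'name' => 'gamma');
}

$tmp_results = array();
foreach (fetch_rows() as $row) {
    // Key the row by its ID as it is read; no second pass needed.
    $tmp_results[$row['id']] = $row;
}

print_r(array_keys($tmp_results));
```

That way the data only ever exists in one array, and there is nothing to 
normalise afterwards.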

Also, is there any reason why you need to process this full set of data in one 
go? Can you not break it up into smaller pieces that won't put as much strain 
on resources?
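
One possible shape for that, purely as a sketch: read_rows() and 
keyed_batches() are names I've made up, the batch size of 4 is arbitrary, and 
it assumes IDs are unique within a batch. Again this uses generators, so 
PHP 5.5 or later.

```php
<?php
// Hypothetical row source producing ten rows, one at a time.
function read_rows()
{
    for ($id = 1; $id <= 10; $id++) {
        yield array('id' => $id, 'value' => $id * $id);
    }
}

// Groups rows into ID-keyed batches of at most $size rows each.
function keyed_batches($rows, $size)
{
    $batch = array();
    foreach ($rows as $row) {
        $batch[$row['id']] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = array();
        }
    }
    // Hand back whatever is left over at the end.
    if ($batch) {
        yield $batch;
    }
}

$batch_sizes = array();
foreach (keyed_batches(read_rows(), 4) as $batch) {
    // Process one batch here; only this batch is held in memory.
    $batch_sizes[] = count($batch);
}

print_r($batch_sizes);
```

Only one batch is in memory at any time, so peak usage depends on the batch 
size rather than on the size of the full result set.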

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/
