Thank you very much.
Steve Holden, I have posted my source code on my blog here:
http://hiparrot.wordpress.com/2005/12/08/implementing-a-simple-net-spider/
I hope you can read it and give me some suggestions.
Any comments will be appreciated.

On 12/2/05, Steve Holden <[EMAIL PROTECTED]> wrote:
could ildg wrote:
> In Java and C#, String is immutable: str=str+"some more" will return a
> new string and leave some garbage.
> So in Java and C#, if there are frequent string operations,
> StringBuilder/StringBuffer is recommended.
>
> Will string operations in Python also leave some garbage? I implemented a
> net-spider in Python which does a lot of HTML string processing. After
> it has been running for some time, the Python process eats up over 300MB
> of memory. Is this because of the string garbage?
>
If you create garbage in a Python program it will normally be collected
and returned to free memory by the garbage collector, which should be
run when memory is exhausted in preference to allocating more memory.
Additional memory should therefore only be claimed when garbage
collection fails to return sufficient free space.

If cyclic data structures are created (structures in which components
refer to each other even though no external references exist) this could
cause problems in older versions of Python, but nowadays the garbage
collector also takes pains to collect unreferenced cyclic structures.
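
As a rough illustration of the cyclic case, here is a minimal sketch assuming CPython and its standard gc and weakref modules. The Node class and make_cycle helper are invented for this example. Two objects refer to each other, every outside reference is dropped, and an explicit collector pass then reclaims them.

```python
import gc
import weakref


class Node:
    """Tiny object that can point at another Node (hypothetical example class)."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    """Build two nodes that refer to each other; return a weak reference to one."""
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # A weak reference lets us observe the object without keeping it alive.
    return weakref.ref(a)


gc.disable()              # keep the automatic collector out of the way for the demo
probe = make_cycle()      # the only strong references now live inside the cycle
alive_before = probe() is not None
gc.collect()              # explicit pass of the cycle detector
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

On CPython this should print True False: the cycle survives until the collector runs and is gone afterwards. In a normal program the collector runs automatically, so the explicit gc.collect() call is only there to make the effect visible.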

> If strings in Python are immutable, what class should I use to avoid too
> much garbage when processing strings frequently?
>
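On the narrow question of which class to use: Python has no StringBuilder as such. The usual idiom is to collect the pieces in a list and join them once, or to write them into an io.StringIO buffer. A minimal sketch follows; the function names are made up for illustration.

```python
import io


def build_with_join(pieces):
    """Collect fragments in a list and join them once at the end."""
    parts = []
    for piece in pieces:
        parts.append(piece)
    return "".join(parts)


def build_with_stringio(pieces):
    """Write fragments into an in-memory text buffer."""
    buf = io.StringIO()
    for piece in pieces:
        buf.write(piece)
    return buf.getvalue()


fragments = ["<li>%d</li>" % i for i in range(3)]
print(build_with_join(fragments))
```

Both helpers return the same string. Intermediate strings produced by repeated concatenation are reclaimed in either case, so this idiom mainly helps speed. It is unlikely on its own to explain a process that keeps growing.
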
The fact that your process uses 300MB implies that you are retaining
references to a large amount of data. Without seeing the code, however,
it's difficult to suggest how you might improve the situation. Are you,
for example, holding the HTML for every spidered page?
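
To make that question concrete, here is a hedged sketch of a crawl loop that remembers only URLs and lets each page's HTML go once its links have been extracted. It is not based on your code, which I have not seen. The names LinkCollector, crawl and fetch are assumptions for illustration, and fetch stands for whatever download function the spider already has.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl that keeps URLs, never page bodies.

    `fetch` is a hypothetical callable: URL in, HTML text out (None on failure).
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        parser.close()
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # html and parser are rebound on the next pass, so the page text
        # becomes garbage here instead of accumulating.
    return visited


# Stand-in for the network: a dict of made-up pages.
fake_site = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": "no links here",
}
pages = crawl("http://example.test/", fake_site.get)
print(pages)
```

The only structures that grow with the crawl are the set and queue of URL strings. If the spider instead appends each page body to a list or dict that lives for the whole run, memory use will climb with every page fetched, whatever the language.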

As a side note, both C# and Java also use garbage collection, so if your
algorithm exhibits the same problem in all three languages this merely
confirms that the problem really is your algorithm, and not the language
in which it is implemented.

regards
  Steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                   www.python.org/pycon/

--
http://mail.python.org/mailman/listinfo/python-list
