[issue9942] Allow memory sections to be OS MERGEABLE

2010-09-25 Thread Kevin Hunter

Kevin Hunter hunt...@earlham.edu added the comment:

> Well, first, this would only work for large objects. [...]
> Why do you think you might have such duplication in your workload?

Some of the projects with which I work involve multiple manipulations of large 
datasets.  Often, we use Python scripts as first and third stages in a 
pipeline.  For example, in one current workflow, we read a large file into a 
cStringIO object, do a few manipulations with it, pass it off to a second 
process, and await the results.  Meanwhile, the large file is sitting around in 
memory because we need to do more manipulations after we get results back from 
the second application in the pipeline.  Graphically:

Python Script A  ->  External App  ->  Python Script A
read large data      process data      more manipulations

Within a single process, I don't see any gain to be had.  However, in this one 
use-case, several copies of this pipeline run concurrently, each with slightly 
different command-line parameters, so each copy holds much the same data in memory.
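
A minimal sketch of the workflow described above, for illustration only. The
external application here is a hypothetical stand-in (a Python one-liner that
upper-cases its input), and io.BytesIO is assumed as the modern equivalent of
cStringIO; neither is taken from the actual project.

```python
import io
import subprocess
import sys

# Hypothetical stand-in for the external application: upper-cases stdin.
EXTERNAL_APP = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_pipeline(raw):
    # Stage 1: hold the data in an in-memory buffer and manipulate it.
    buf = io.BytesIO(raw)
    payload = buf.getvalue().strip()

    # Stage 2: hand the data to the external process and wait for its result.
    # The buffer stays resident in this process for the whole wait.
    done = subprocess.run(
        EXTERNAL_APP, input=payload, stdout=subprocess.PIPE, check=True
    )

    # Stage 3: further manipulation using both the original data and the result.
    return payload + b" -> " + done.stdout


result = run_pipeline(b"  abc \n")
```

When several such scripts run at once on the same input, each process keeps
its own private copy of the buffer during stage 2, which is the duplication
in question.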

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9942
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9942] Allow memory sections to be OS MERGEABLE

2010-09-25 Thread Kevin Hunter

Kevin Hunter hunt...@earlham.edu added the comment:

> Why do you read it into a cStringIO? A cStringIO has the same interface
> as a file, so you could simply operate on the file directly.

In that particular case, because it isn't actually a file.  That workflow was 
my attempt at simplification to illustrate a point.

I think the point is moot, however, as I've gotten what I needed from this 
feature request/discussion.  Not one, but three Python developers seem opposed 
to the idea, or at least skeptical.  That's enough to tell me that my 
first-order supposition that Python objects could be MERGEABLE is not on target.

Cheers.




[issue9942] Allow memory sections to be OS MERGEABLE

2010-09-24 Thread Kevin Hunter

New submission from Kevin Hunter hunt...@earlham.edu:

Should Python enable a way for folks to inform the OS of MADV_MERGEABLE memory?

I can't speak for other OSs, but Linux 2.6.32 added the ability for processes 
to inform the kernel that they have memory that will likely not change for a 
while.  This is done through the madvise syscall with MADV_MERGEABLE.

http://www.kernel.org/doc/Documentation/vm/ksm.txt
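
For illustration, a minimal sketch of what such a call can look like from
Python, assuming Python 3.8 or later on Linux, where the mmap module exposes
madvise() and the MADV_MERGEABLE constant. The helper name advise_mergeable is
hypothetical. The sketch covers only an explicitly created anonymous mapping,
not ordinary Python objects.

```python
import mmap


def advise_mergeable(region):
    """Ask the kernel to consider `region` for merging; True if accepted."""
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None or not hasattr(region, "madvise"):
        # Not Linux, or a Python version without mmap.madvise().
        return False
    try:
        region.madvise(advice)  # advises the whole mapping
    except OSError:
        # For example, a kernel built without KSM support.
        return False
    return True


# An anonymous, page-aligned mapping filled with data that will not change.
region = mmap.mmap(-1, 16 * mmap.PAGESIZE)
region.write(b"\0" * len(region))
accepted = advise_mergeable(region)
```

Even when the call is accepted, pages are only merged if the KSM daemon is
enabled (/sys/kernel/mm/ksm/run set to 1), as the kernel document above
describes.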

After initial conversations on IRC, it was suggested that this would be 
difficult to do in the Python layer, but that the OS doesn't care which pages 
of bytes it is told are mergeable.  Thus when I, as an application programmer, 
know that I have some objects that will be around for a while and won't 
change, I can let the OS know that it might be beneficial to merge them.

I suggest this might belong in a library, because it may only be useful for 
certain projects.

--
components: Library (Lib)
messages: 117317
nosy: hunteke
priority: normal
severity: normal
status: open
title: Allow memory sections to be OS MERGEABLE
type: feature request




[issue9942] Allow memory sections to be OS MERGEABLE

2010-09-24 Thread Kevin Hunter

Kevin Hunter hunt...@earlham.edu added the comment:

My first thought is "Why is the reference counter stored with the object 
itself?"  I imagine there are very good reasons, however, and this is not an 
area in which I have much mastery.

Answering the question as best I can: I don't know how the reference counter is 
implemented in CPython, but if it's just a field in a struct, then madvise 
could be sent the memory location starting with the byte immediately following 
the reference counter.
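
One caveat to the idea above: madvise operates on whole pages and rejects a
start address that is not page-aligned, so only the pages lying entirely
inside an object's data could be advised. A minimal sketch of that
arithmetic follows; the helper name mergeable_span is hypothetical, and the
addresses are plain integers rather than real object addresses.

```python
import mmap


def mergeable_span(start, length, page=mmap.PAGESIZE):
    """Return (aligned_start, aligned_length) covering the whole pages that
    lie entirely inside [start, start + length), or None if there are none."""
    first = -(-start // page) * page        # round start up to a page boundary
    end = (start + length) // page * page   # round the end down to a boundary
    if end <= first:
        return None
    return first, end - first


# A 3-page buffer that begins 16 bytes into a page (e.g. after a header).
span = mergeable_span(16, 3 * 4096, page=4096)
```

In this example only two of the three pages qualify, and a buffer smaller
than a page may contain no advisable page at all.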

If there's more to it than that, I'll have to back off with "I don't know."  
I'm perhaps embarrassed that I'm not at all a Python developer, merely a Python 
application developer.  I have a few memory-hungry Python projects that, at 
first glance, I believe to be creating MERGEABLE objects.
