Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Jerdonek
[Apologies for resurrecting a few-weeks-old thread.] On Thu, Oct 4, 2012 at 2:46 PM, mar...@v.loewis.de wrote: Quoting Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I'm opposed for a

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Angelico
On Sun, Nov 18, 2012 at 5:47 AM, Chris Jerdonek chris.jerdo...@gmail.com wrote: On Thu, Oct 4, 2012 at 2:46 PM, mar...@v.loewis.de wrote: I really fail to see what problem people have with large source files. What is it that you want to do that can be done easier if it's multiple files? One

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Jerdonek
On Sat, Nov 17, 2012 at 10:55 AM, Chris Angelico ros...@gmail.com wrote: On Sun, Nov 18, 2012 at 5:47 AM, Chris Jerdonek chris.jerdo...@gmail.com wrote: On Thu, Oct 4, 2012 at 2:46 PM, mar...@v.loewis.de wrote: I really fail to see what problem people have with large source files. What is it

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 2:22 PM, Stephen J. Turnbull step...@xemacs.org wrote: Nick Coghlan writes: OK, I need to weigh in after seeing this kind of reply. Large source files are discouraged in general because they're a code smell that points strongly towards a *lack of modularity*

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff, but in principle *any* codec could be needed. Is it just an heuristic that the codecs needed for 99% of the world are

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: unicodeobject.c is too big, and should be restructured to make any natural modularity explicit, and provide an easier path for users that want to understand how the unicode implementation works. You can also achieve that goal by structuring the code in

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Maciej Fijalkowski
On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg m...@egenix.com wrote: On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff, but in principle *any* codec could be needed.

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 11:18, Maciej Fijalkowski wrote: On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg m...@egenix.com wrote: On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff,

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Serhiy Storchaka
On 25.10.12 12:18, Maciej Fijalkowski wrote: I challenge you to find a benchmark that is being significantly affected (15%) with the split proposed by Victor. It does not even have to be a real-world one, although that would definitely buy it more credibility. I see 10% slowdown for UTF-8

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Maciej Fijalkowski
I think you misunderstood. What I described is the reason for having the base codecs in unicodeobject.c. I think we all agree that inlining has a positive effect on performance. The scale of the effect depends on the used compiler and platform. Well. Inlining can have positive or negative

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Serhiy Storchaka
On 25.10.12 12:49, M.-A. Lemburg wrote: I think you misunderstood. What I described is the reason for having the base codecs in unicodeobject.c. For example PyUnicode_FromStringAndSize and PyUnicode_FromString are thin wrappers around PyUnicode_DecodeUTF8Stateful. I think this is a reason

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 8:07 PM, Maciej Fijalkowski fij...@gmail.com wrote: I think you misunderstood. What I described is the reason for having the base codecs in unicodeobject.c. I think we all agree that inlining has a positive effect on performance. The scale of the effect depends on the

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On 25/10/2012 02:03, Nick Coghlan wrote: speed.python.org is also making progress, and once that is up and running (which will happen well before any Python 3.4 release) it will be possible to compare the numbers between 3.3 and trunk to help determine the validity of any concerns regarding

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On 25/10/2012 00:15, Nick Coghlan wrote: However, -1 on the faux modularity idea of breaking up the files on disk, but still exposing them to the compiler and linker as a monolithic block, though. That would be completely missing the point of why large source files are bad. I disagree with

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Larry Hastings
On 10/24/2012 03:15 PM, Nick Coghlan wrote: Breaking such files up into separately compiled modules serves two purposes: 1. It proves that the code *isn't* a tangled monolithic mess; 2. It enlists the compilation toolchain's assistance in ensuring that remains the case in the future.

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On Thu, 25 Oct 2012 08:13:53 -0700 Larry Hastings la...@hastings.org wrote: I'm all for good software engineering practice. But can you cite objective reasons why large source files are provably bad? Not tangled monolithic messes, not poorly-factored code. I agree that those are

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Stephen J. Turnbull
Antoine Pitrou writes: Well, tangled monolithic mess is quite true about unicodeobject.c, IMO. s/object.c// and your point remains valid. Just reading the table of contents for UTR#17 (http://www.unicode.org/reports/tr17/) should convince you that it's not going to be easy to produce an

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Larry Hastings
On 10/23/2012 09:29 AM, Georg Brandl wrote: Especially since you're suggesting a huge number of new files, I question the argument of better navigability. FWIW I'm -1 on it too. I don't see what the big deal is with large source files. If you have difficulty finding your way around

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Nick Coghlan
On Oct 25, 2012 2:06 AM, Larry Hastings la...@hastings.org wrote: On 10/23/2012 09:29 AM, Georg Brandl wrote: Especially since you're suggesting a huge number of new files, I question the argument of better navigability. FWIW I'm -1 on it too. I don't see what the big deal is with large

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Barry Warsaw
On Oct 25, 2012, at 08:15 AM, Nick Coghlan wrote: OK, I need to weigh in after seeing this kind of reply. Large source files are discouraged in general because they're a code smell that points strongly towards a *lack of modularity* within a *complex piece of functionality*. Modularity is good,

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 8:37 AM, Barry Warsaw ba...@python.org wrote: On Oct 25, 2012, at 08:15 AM, Nick Coghlan wrote: OK, I need to weigh in after seeing this kind of reply. Large source files are discouraged in general because they're a code smell that points strongly towards a *lack of

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Stephen J. Turnbull
Nick Coghlan writes: OK, I need to weigh in after seeing this kind of reply. Large source files are discouraged in general because they're a code smell that points strongly towards a *lack of modularity* within a *complex piece of functionality*. Sure, but large numbers of tiny source

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Benjamin Peterson
2012/10/22 Victor Stinner victor.stin...@gmail.com: Hi, I forked the CPython repository to work on my split unicodeobject.c project: http://hg.python.org/sandbox/split-unicodeobject.c The result is 10 files (including the existing unicodeobject.c): 1176 Objects/unicodecharmap.c 1678

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread M.-A. Lemburg
On 23.10.2012 10:22, Benjamin Peterson wrote: 2012/10/22 Victor Stinner victor.stin...@gmail.com: Hi, I forked the CPython repository to work on my split unicodeobject.c project: http://hg.python.org/sandbox/split-unicodeobject.c The result is 10 files (including the existing unicodeobject.c):

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Victor Stinner
Such a restructuring should not result in compilers no longer being able to optimize code by inlining functions in one of the most important basic types we have in Python 3. I agree that performance is important. But I'm not convinced that moving functions has a real impact on performance,

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Antoine Pitrou
On 23/10/2012 12:05, Victor Stinner wrote: Such a restructuring should not result in compilers no longer being able to optimize code by inlining functions in one of the most important basic types we have in Python 3. I agree that performance is important. But I'm not convinced that moving

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Amaury Forgeot d'Arc
2012/10/23 Antoine Pitrou solip...@pitrou.net: I agree with Marc-André, there's no point in compiling those files separately. #include'ing them in the master unicodeobject.c file is fine. I also find unicodeobject.c difficult to navigate. Even if we don't split the file, I'd advocate a

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Victor Stinner
The amount of code will not be reduced, but now you also need to guess what file some piece of functionality may be in. How do you search for a piece of code? If you search for a function by its name, it does not matter in which file it is defined if you use an IDE or vim/emacs with a correct

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Benjamin Peterson
2012/10/7 Victor Stinner victor.stin...@gmail.com: Another problem with huge files is to handle dependencies with static functions. If the function A calls the function B which calls the function C, you have to order A, B and C correctly if these functions are private and not declared at the

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Chris Angelico
On Mon, Oct 8, 2012 at 8:17 AM, Victor Stinner victor.stin...@gmail.com wrote: Another problem with huge files is to handle dependencies with static functions. If the function A calls the function B which calls the function C, you have to order A, B and C correctly if these functions are

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread martin
Quoting Victor Stinner victor.stin...@gmail.com: The amount of code will not be reduced, but now you also need to guess what file some piece of functionality may be in. How do you search for a piece of code? I type /pattern in vim, or Ctrl-s (incremental search) in Emacs. If you search for

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-05 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of CPython: 14,849 lines. I don't know exactly how to split it, but first I would like to know if you would agree with the idea. Example: -

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-05 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 6:49 PM, Stephen J. Turnbull step...@xemacs.org wrote: Chris Jerdonek writes: You can create multiple files this way. I just verified it. But the problem happens with merging. You will create merge conflicts in the deleted portions of every split file on every

[Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
Hi, I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of CPython: 14,849 lines. I don't know exactly how to split it, but first I would like to know if you would agree with the idea. Example: - Objects/unicode/codecs.c -

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Andrew Svetlov
I like the idea. From my perspective, it is better to use a subdirectory, for the sake of easy grep-style searching. On Thu, Oct 4, 2012 at 11:30 PM, Victor Stinner victor.stin...@gmail.com wrote: Hi, I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I imagine it could also prevent inlining of hot paths. -- Regards, Benjamin

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
2012/10/4 Benjamin Peterson benja...@python.org: 2012/10/4 Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I imagine it could also prevent inlining of hot paths. It depends how the code is

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner victor.stin...@gmail.com: 2012/10/4 Benjamin Peterson benja...@python.org: 2012/10/4 Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I imagine it could also prevent

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 1:30 PM, Victor Stinner victor.stin...@gmail.com wrote: I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of CPython: 14,849 lines. ... I only see one argument against such refactoring: it will be harder to

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
I am not siding with either side of the change yet, but an additional argument against is that history may become less convenient to navigate and track (e.g. hg annotate may lose information depending on how the split is done). If new files are created using hg cp unicodeobject.c

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner victor.stin...@gmail.com: I am not siding with either side of the change yet, but an additional argument against is that history may become less convenient to navigate and track (e.g. hg annotate may lose information depending on how the split is done). If new files

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
2012/10/5 Benjamin Peterson benja...@python.org: 2012/10/4 Victor Stinner victor.stin...@gmail.com: If new files are created using hg cp unicodeobject.c unicode/newfile.c, the history is kept. Yes, but you can only create one file that way. You can create as many files as you want. Try: ---

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 4:31 PM, Benjamin Peterson benja...@python.org wrote: 2012/10/4 Victor Stinner victor.stin...@gmail.com: I am not siding with either side of the change yet, but an additional argument against is that history may become less convenient to navigate and track (e.g. hg

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Eric V. Smith
On 10/4/2012 4:30 PM, Victor Stinner wrote: Hi, I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of CPython: 14,849 lines. What problem are you trying to solve? -- Eric.

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Antoine Pitrou
On Thu, 04 Oct 2012 23:46:57 +0200 mar...@v.loewis.de wrote: Quoting Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I'm opposed for a different reason: I think it will be *harder* to

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Stephen J. Turnbull
Chris Jerdonek writes: You can create multiple files this way. I just verified it. But the problem happens with merging. You will create merge conflicts in the deleted portions of every split file on every merge. There may be a way to avoid this that I don't know about though (i.e. to

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Antoine Pitrou solip...@pitrou.net: On Thu, 04 Oct 2012 23:46:57 +0200 mar...@v.loewis.de wrote: Quoting Victor Stinner victor.stin...@gmail.com: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I'm opposed for a different