Re: Managing Google Groups headaches

2013-12-06 Thread rusi
On Saturday, December 7, 2013 7:54:50 AM UTC+5:30, Ned Batchelder wrote:
> On 12/6/13 8:03 AM, rusi wrote:

> > Leaving aside whose fault this is (very likely buggy google groups),
> > this mojibaking cannot happen if the assumption "All text is ASCII"
> > were to uniformly hold.
> > Of course with unicode also this can be made to not happen, but that
> > is fragile and error-prone.  And that is because ASCII (not extended)
> > is ONE thing in a way that unicode is hopelessly a motley inconsistent
> > variety.

> You seem to be suggesting that we should stick to ASCII.  There are of 
> course languages that need more than just the Latin alphabet.  How would 
> you suggest we support them?  Or maybe I don't understand?

Heh! Yes I guess that can be read into what I was saying.

Practically: I don't see that as an option, nor do I think the question of
going back to ASCII even arises.

I was talking more philosophically/historically.

Up until the time of Unix a file for example was a structured
heavy-duty concept motivated by entirely technological considerations:
http://en.wikipedia.org/wiki/Data_set_%28IBM_mainframe%29

By simplifying that into the modern concept of file -- just a stream
of bytes -- and allowing the puns:

  byte string
= char list
= text

some elegant systems could be made with people having 'beautiful thoughts:'

Everything that could be stored anywhere -- core or disk -- being bytes
one could go to the next stage and pass around these bytes between
processes. And so we get the elegant --  pipeline -- beauty of Unix
scripts.

Of course there was a catch (Isn't there always?):

Things that did not fit in with this philosophy -- eg clicks of a mouse,
bits on display -- were modelled badly or not at all.

Not-at-all: CLI
Badly: Monstrosity called X

And this explains some of the cultural kinks of our field:

Unix guys invariably think of CLIs as natural and obvious whereas GUIs
are just wasteful eye-candy.

[Yours truly is one of those old geezers who does not know how to
write a GUI to save his life. Almost normal in the Unix world except
that he's not proud of it]

Windows/Mac people do not suffer these delusions, but then they don't think of
programming as natural or obvious at all.

I've often been amused at Windows folk: they don't think of Word as a program.
Rather, docs are things that magically open when clicked :-)

Brings me to the point I was trying to make (got side-tracked by
the failure of a character to round-trip between me and Roy -- I'm none the
wiser why)

The ASCII = Text = Unicode (non)equation is a relatively minor point.

The more central point is that humans use and need more than just
words to communicate.  By straitjacketing communication into the thin
channel of text we are severely impoverishing ourselves.

We communicate with systems with programs that are unstructured
text-files even though programs are conceptually highly structured entities.

Likewise we communicate with each other by this obscenely obsolete
textual mode that I am using right now when rich text formats have been
available for decades.

Some of my more detailed writings on this:

http://blog.languager.org/2013/09/poorest-computer-users-are-programmers.html

http://blog.languager.org/2012/10/html-is-why-mess-in-programming-syntax.html
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 2:16 PM, rusi  wrote:
> On Saturday, December 7, 2013 8:11:45 AM UTC+5:30, Chris Angelico wrote:
>> On Sat, Dec 7, 2013 at 1:33 PM, rusi  wrote:
>> > That seems to suggest that something is not right with the python
>> > mailing list config. No??
>
>> If in doubt, blame someone else, eh?
>
>> I'd first check what your browser's actually sending. Firebug will
>> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
>> That's the first step.
>
> If you give me some tip where to look, I'll do that.
> But I don't see what this has to do with forms.
>

Page encodings specify what comes from the server to your browser.
Your post went the other way. Tracing the data going back to the
server would tell you how it's encoded.
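
A minimal sketch of why the encoding of the outgoing data matters, assuming
Python 3's urllib.parse. The sample text is illustrative and is not the
actual post. The same ellipsis character is percent-encoded as different
bytes depending on the charset used for the submission:

```python
from urllib.parse import quote, unquote

text = "human communication\u2026"  # ends with U+2026 HORIZONTAL ELLIPSIS

as_utf8 = quote(text, encoding="utf-8")
as_cp1252 = quote(text, encoding="cp1252")

print(as_utf8)    # the ellipsis becomes three bytes: %E2%80%A6
print(as_cp1252)  # the ellipsis becomes one byte: %85

# Decoding with the wrong charset does not round-trip.
garbled = unquote(as_cp1252, encoding="utf-8", errors="replace")
print(garbled == text)  # False
```

Watching the request in a tool such as Firebug shows which of the two forms
the browser actually sent.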

ChrisA


Re: One liners

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 2:27 PM, Roy Smith  wrote:
> --
> extracols = sorted(set.union(*(set(t.data.keys()) for t in tracks))) if
> tracks else []
> --
> c2s = compids2songs(set(targets.keys()) |
> set.union(*map(set,targets.itervalues())),self.docmap,self.logger) if
> targets else {}

Easy rewrites:

extracols = tracks and sorted(set.union(*(set(t.data.keys()) for t in tracks)))

Assumes that tracks is a list, which it most likely is given the
context. Parallel with the other.
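
A small sketch of that equivalence, using a hypothetical Track class as a
stand-in for the real track objects (only the .data dict is assumed). One
difference to be aware of: with "and", a None input yields None rather
than [].

```python
class Track:
    """Hypothetical stand-in: only the .data mapping matters here."""
    def __init__(self, **data):
        self.data = data


def extracols_ternary(tracks):
    return sorted(set.union(*(set(t.data.keys()) for t in tracks))) if tracks else []


def extracols_and(tracks):
    # A falsy `tracks` is returned unchanged, so [] stays [].
    return tracks and sorted(set.union(*(set(t.data.keys()) for t in tracks)))


tracks = [Track(artist="a", bpm=120), Track(artist="b", year=2013)]
print(extracols_ternary(tracks))  # ['artist', 'bpm', 'year']
print(extracols_and(tracks))      # ['artist', 'bpm', 'year']
print(extracols_and([]))          # []
print(extracols_and(None))        # None, whereas the ternary form gives []
```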

ChrisA


Re: One liners

2013-12-06 Thread Roy Smith
In article ,
 Dan Stromberg  wrote:

> A lot of things people do with regex's, could be done with string methods
> more clearly and concisely.

That is true.  The problem is, there are a lot of things for which regex 
is the right tool, but people get out of practice using them (or never 
learned how) because they gravitate to string methods for most tasks.

It's like any sharp tool.  When skillfully handled, they're excellent at 
the tasks they were designed for.  But, if you don't practice the 
necessary skills, you end up just re-enacting the Black Knight sketch.


Re: Python 2.8 release schedule

2013-12-06 Thread Ethan Furman

On 12/06/2013 06:07 PM, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 1:00 PM, Mark Lawrence  wrote:
>> On 07/12/2013 01:54, Chris Angelico wrote:
>>> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence 
>>> wrote:
>>>> Sorry but I don't get it :)
>>>
>>> [explained the joke]
>>
>> Clearly that went straight over your head.
>
> *facepalm* Yep, it did. Completely missed what you said there.
>
> Doh. I see what you did there... now.


Heh.  It was too subtle for me, too.  'Course, I've been fighting OpenERP all 
day...

--
~Ethan~


Re: Managing Google Groups headaches

2013-12-06 Thread Roy Smith
In article <52a290ed$0$30003$c3e8da3$54964...@news.astraweb.com>,
 Steven D'Aprano  wrote:

> In contrast, that is not the case with nearly all web forums. By 
> deliberate design, or mere ignorance and neglect, they mix up the message 
> you care about ("Hi Bob...") and the stuff you need to get that message 
> (the HTML and Javascript code) in one big ball of mud, and don't have 
> APIs for getting messages.

BTW, I was going to bring up vBulletin as an example of a typical web 
forum which suffers from the "big ball of mud" syndrome.  Then I 
discovered that it does indeed have a reasonable looking API 
(http://www.vbulletin.com/vbcms/content.php/367-API-Overview).

Beautiful Soup is an awesome tool.  Even more awesome is when you don't 
have to use it :-)


Re: One liners

2013-12-06 Thread Roy Smith
In article <52a287cb$0$30003$c3e8da3$54964...@news.astraweb.com>,
 Steven D'Aprano  wrote:

> The ternary if is slightly unusual and unfamiliar

It's only unusual and unfamiliar if you're not used to using it :-)  
Coming from a C/C++ background, I always found the lack of a ternary 
expression rather limiting.  There was much rejoicing in these parts 
when it was added to the language relatively recently.  I use them a lot.

On the other hand, I found list comprehensions to be mind-bogglingly 
confusing when I first saw them (read: slightly unusual and unfamiliar).  
It took me a long time to warm up to the concept.  Now I love them.

> As for readability, I accept that ternary if is unusual compared to other 
> languages, but it's still quite readable in small doses. If you start 
> chaining them:
> 
> result = a if condition else b if flag else c if predicate else d 
> 
> you probably shouldn't.

That I agree with (and it's just as true in C as it is in Python).

Just for fun, I took a look through the Songza code base.  66 kloc of 
non-whitespace Python.  I found 192 ternary expressions.  Here's a few 
of the more bizarre ones (none of which I consider remotely readable):

--
extracols = sorted(set.union(*(set(t.data.keys()) for t in tracks))) if 
tracks else []
--
c2s = compids2songs(set(targets.keys()) | 
set.union(*map(set,targets.itervalues())),self.docmap,self.logger) if 
targets else {}
--
code = 2 if (pmp3,paac)==(mmp3,maac) else 3 if any(x is None for x in 
(pmp3,paac,mmp3,maac)) else 4
--

Anybody else have some fun ternary abuse examples?
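
For comparison, the last of these can be unpacked into an ordinary
function. This is only a sketch: the function name is made up, and what
the codes 2, 3 and 4 mean is not stated above.

```python
def classify(pmp3, paac, mmp3, maac):
    """Same logic as the chained ternary, spelled out one test per line."""
    if (pmp3, paac) == (mmp3, maac):
        return 2
    if any(x is None for x in (pmp3, paac, mmp3, maac)):
        return 3
    return 4


print(classify(128, 64, 128, 64))    # 2: both pairs match
print(classify(128, None, 128, 64))  # 3: pairs differ and a value is None
print(classify(128, 64, 256, 64))    # 4: pairs differ, nothing is None
```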


Re: One liners

2013-12-06 Thread Dan Stromberg
On Fri, Dec 6, 2013 at 6:07 PM, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> On Fri, 06 Dec 2013 15:54:22 -0800, Dan Stromberg wrote:
>
> > Does anyone else feel like Python is being dragged too far in the
> > direction of long, complex, multiline one-liners?  Or avoiding temporary
> > variables with descriptive names?  Or using regex's for everything under
> > the sun?
>
> All those things are stylistic issues, not language issues. Yes, I see
> far too many people trying to squeeze three lines of code into one, but
> that's their choice, not the language leading them that way.
>

Yes, stylistic, or even "cultural".


> I refuse to apologise
> for writing the one-liner:
>
> result = [func(item) for item in sequence]
>
> instead of four:
>
> result = []
> for i in range(len(sequence)):
>     item = sequence[i]
>     result.append(func(item))
>
IMO, this is a time when the one liner is more clear.  But if you start
trying to stretch that to extremes, it becomes worse instead of better.

>
> > What happened to using classes?  What happened to the beautiful emphasis
> > on readability?  What happened to debuggability (which is always harder
> > than writing things in the first place)?  And what happened to string
> > methods?
>
> What about string methods?
>
A lot of things people do with regex's, could be done with string methods
more clearly and concisely.
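
A few illustrative cases, not taken from any particular code base, where a
string method does the same job as a regex. The sample line is made up:

```python
import re

line = "Content-Type: text/plain; charset=windows-1252"

# Prefix test: re.match vs. str.startswith
print(bool(re.match(r"Content-Type:", line)), line.startswith("Content-Type:"))

# Splitting on a fixed separator: re.split vs. str.split
print(re.split(r"; ", line) == line.split("; "))

# Text after a fixed marker: capture group vs. str.partition
charset_re = re.search(r"charset=(.*)$", line).group(1)
charset_str = line.partition("charset=")[2]
print(charset_re, charset_str)
```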

> The beauty of Python is that it is a multi-paradigm language. You can
> write imperative, procedural, functional, OOP, or pipelining style (and
> probably more). The bad thing about Python is that if you're reading
> other people's code you *need* to be familiar with all those styles.
>

That's fine.  That's appropriate.   But I imagine any of these can be done
with the intention of being more clever than clear.

BTW, what's pipelining style?  Like bash?

> > I'm pleased to see Python getting more popular, but it feels like a lot
> > of newcomers are trying their best to turn Python into Perl or
> > something, culturally speaking.
>
> They're probably writing code using the idioms they are used to from
> whatever language they have come from. Newcomers nearly always do this.
> The more newcomers you get, the less Pythonic the code you're going to
> see from them.
>

Nod.


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread rusi
On Saturday, December 7, 2013 8:11:45 AM UTC+5:30, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 1:33 PM, rusi  wrote:
> > That seems to suggest that something is not right with the python
> > mailing list config. No??

> If in doubt, blame someone else, eh?

> I'd first check what your browser's actually sending. Firebug will
> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
> That's the first step.

If you give me some tip where to look, I'll do that.
But I don't see what this has to do with forms.

Everything in the python archive (not just my posts) shows as Win 1252
[I checked about 6]

Every other page that I checked (most nothing to do with python list,
GG etc) shows UTF-8. [I checked about 5]

None of these checkings had forms to be filled.


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread MRAB

On 07/12/2013 02:41, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 1:33 PM, rusi  wrote:
>> That seems to suggest that something is not right with the python
>> mailing list config. No??
>
> If in doubt, blame someone else, eh?
>
> I'd first check what your browser's actually sending. Firebug will
> help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
> That's the first step.


Looking back through the thread, it looks like:

Roy posted a reply in us-ascii.

rusi replied in windows-1252, adding the '…'.

Roy replied in us-ascii, but with 'Š' in place of '…'.

rusi replied in utf-8, with '�' in place of '…'
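
A sketch of how the last substitution can arise, assuming only Python 3's
built-in codecs. It reproduces the windows-1252 to U+FFFD step; it does not
attempt to explain how 'Š' appeared, since that depends on what the
newsreader did.

```python
ellipsis = "\u2026"  # HORIZONTAL ELLIPSIS

cp1252_bytes = ellipsis.encode("cp1252")
utf8_bytes = ellipsis.encode("utf-8")
print(cp1252_bytes)  # b'\x85'
print(utf8_bytes)    # b'\xe2\x80\xa6'

# us-ascii cannot represent the character at all.
try:
    ellipsis.encode("ascii")
except UnicodeEncodeError as exc:
    print("not ASCII:", exc.reason)

# Reading the windows-1252 byte as if it were UTF-8 fails; with
# errors="replace" the decoder substitutes U+FFFD.
misread = cp1252_bytes.decode("utf-8", errors="replace")
print(misread == "\ufffd")  # True
```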



Re: Managing Google Groups headaches

2013-12-06 Thread Steven D'Aprano
On Thu, 05 Dec 2013 23:13:54 -0800, rusi wrote:

> On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

>> The real problem with web forums is they conflate transport and
>> presentation into a single opaque blob, and are pretty much universally
>> designed to be a closed system.  Mail and usenet were both engineered
>> to make a sharp division between transport and presentation, which
>> meant it was possible to evolve each at their own pace.
> 
>> Mostly that meant people could go off and develop new client
>> applications which interoperated with the existing system.  But, it
>> also meant that transport layers could be switched out (as when NNTP
>> gradually, but inexorably, replaced UUCP as the primary usenet
>> transport layer).
> 
> There is a deep assumption hovering round-about the above -- what I will
> call the 'Unix assumption(s)'.  But before that, just a check on
> terminology. By 'presentation' you mean what people normally call
> 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
> sendmail, exim, qmail etc etc -- what normally are called
> 'mail-servers.'  Right??

Presentation means how the data is presented. Transport means how the 
data is transported. It doesn't refer to a specific piece of software 
like Thunderbird, but to the logical fact that what people see (the 
presentation) is not identical to what gets transported from one computer 
to another.

All programs make *some* distinction between the two. Email is encoded, 
wrapped with normally-hidden headers, and then sent, before being 
displayed at the other end sans such headers. But some programs make a 
nice clean distinction. If your mail client converts emails to sound for 
the benefit of the blind, that is easy to do because there is a clean 
*and public* distinction between the transport and presentation of email 
-- everybody can agree on how to extract the message ("Hi Bob, are we 
still meeting up for drinks tomorrow night?") from the transportation 
layer (the email envelope).
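
As a rough illustration of that public separation, here is a sketch using
Python 3's standard email package. The addresses and headers are invented
for the example; only the message text comes from the paragraph above.

```python
from email import message_from_string
from email.policy import default

raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Drinks\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "Hi Bob, are we still meeting up for drinks tomorrow night?\n"
)

msg = message_from_string(raw, policy=default)

# Transport-level details live in the headers...
print(msg["Subject"])
print(msg.get_content_charset())

# ...while the message itself can be pulled out on its own.
body = msg.get_content()
print(body.strip())
```

Any client, whether it renders text, HTML or speech, can start from the
same parsed body.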

In contrast, that is not the case with nearly all web forums. By 
deliberate design, or mere ignorance and neglect, they mix up the message 
you care about ("Hi Bob...") and the stuff you need to get that message 
(the HTML and Javascript code) in one big ball of mud, and don't have 
APIs for getting messages. Or worse, they deliberately obfuscate the 
content, in an attempt to lock people in to only using the specific 
interface they want you to use.

Consider the difference between (say) Twitter, which has published 
standard APIs for reading and writing tweets, and StackOverflow, which as 
far as I can tell insists that the one and only way to read and write 
comments is via their website. The internal formatting of the website is 
not public and is subject to change without notice.

(If I have unfairly maligned StackOverflow, substitute any number of 
dozens or hundreds of web forums.) 


[...]
> To the extent that these assumptions are invalid, the 'opaque-blob' may
> well be preferable.

No. Nice clean interfaces separating concerns (such as transport and 
presentation) have little to do with ASCII text. One can define clear and 
open binary protocols too.



-- 
Steven


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 1:33 PM, rusi  wrote:
> That seems to suggest that something is not right with the python
> mailing list config. No??

If in doubt, blame someone else, eh?

I'd first check what your browser's actually sending. Firebug will
help there. See if your form fill-out is encoded as UTF-8 or CP-1252.
That's the first step.

ChrisA


Re: One liners

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 1:28 PM, Steven D'Aprano
 wrote:
> As for readability, I accept that ternary if is unusual compared to other
> languages...

All the C-derived ternary operators put the condition first, but
Python puts the condition in the middle. What that does for
readability I don't really know. Which is more important?
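
One thing that can be checked is evaluation order. In this small sketch
(the three helper functions are made up for the demonstration), the
condition is evaluated first even though it is written in the middle, and
only the selected branch runs:

```python
calls = []

def cond():
    calls.append("cond")
    return False

def left():
    calls.append("left")
    return "a"

def right():
    calls.append("right")
    return "b"

result = left() if cond() else right()
print(result)  # b
print(calls)   # ['cond', 'right']
```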

ChrisA


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread rusi
On Saturday, December 7, 2013 12:30:18 AM UTC+5:30, Steven D'Aprano wrote:
> On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:

> > Evidently (and completely inadvertently) this exchange has just
> > illustrated one of the inadmissible assumptions:
> > "unicode as a medium is universal in the same way that ASCII used to be"

> Ironically, your post was not Unicode.

> Seriously. I am 100% serious.

> Your post was sent using a legacy encoding, Windows-1252, also known as 
> CP-1252, which is most certainly *not* Unicode. Whatever software you 
> used to send the message correctly flagged it with a charset header:

> Content-Type: text/plain; charset=windows-1252

> Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle 
> encodings correctly (or at all!), it screws up the encoding then sends a 
> reply with no charset line at all. This is one bug that cannot be blamed 
> on Google Groups -- or on Unicode.

> > I wrote a number of ellipsis characters ie codepoint 2026 as in:

> Actually you didn't. You wrote a number of ellipsis characters, hex byte 
> \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to 
> code point U+2026 in Unicode, but the two are as distinct as ASCII and 
> EBCDIC.

> > Somewhere between my sending and your quoting those ellipses became the
> > replacement character FFFD

> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about 
> encodings and character sets. It doesn't just assume things are ASCII, 
> but makes a half-hearted attempt to be charset-aware, but badly. I can 
> only imagine that it was written back in the Dark Ages where there were a 
> lot of different charsets in use but no conventions for specifying which 
> charset was in use. Or perhaps the author was smoking crack while coding.

> > Leaving aside whose fault this is (very likely buggy google groups),
> > this mojibaking cannot happen if the assumption "All text is ASCII" were
> > to uniformly hold.

> This is incorrect. People forget that ASCII has evolved since the first 
> version of the standard in 1963. There have actually been five versions 
> of the ASCII standard, plus one unpublished version. (And that's not 
> including the things which are frequently called ASCII but aren't.)

> ASCII-1963 didn't even include lowercase letters. It is also missing some 
> graphic characters like braces, and included at least two characters no 
> longer used, the up-arrow and left-arrow. The control characters were 
> also significantly different from today.

> ASCII-1965 was unpublished and unused. I don't know the details of what 
> it changed.

> ASCII-1967 is a lot closer to the ASCII in use today. It made 
> considerable changes to the control characters, moving, adding, removing, 
> or renaming at least half a dozen control characters. It officially added 
> lowercase letters, braces, and some others. It replaced the up-arrow 
> character with the caret and the left-arrow with the underscore. It was 
> ambiguous, allowing variations and substitutions, e.g.:

> - character 33 was permitted to be either the exclamation 
>   mark ! or the logical OR symbol |

> - consequently character 124 (vertical bar) was always 
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way

> - character 35 was permitted to be either the number sign # or 
>   the pound sign £

> - character 94 could be either a caret ^ or a logical NOT ¬

> Even the humble comma could be pressed into service as a cedilla.

> ASCII-1968 didn't change any characters, but allowed the use of LF on its 
> own. Previously, you had to use either LF/CR or CR/LF as newline.

> ASCII-1977 removed the ambiguities from the 1967 standard.

> The most recent version is ASCII-1986 (also known as ANSI X3.4-1986). 
> Unfortunately I haven't been able to find out what changes were made -- I 
> presume they were minor, and didn't affect the character set.

> So as you can see, even with actual ASCII, you can have mojibake. It's 
> just not normally called that. But if you are given an arbitrary ASCII 
> file of unknown age, containing code 94, how can you be sure it was 
> intended as a caret rather than a logical NOT symbol? You can't.

> Then there are at least 30 official variations of ASCII, strictly 
> speaking part of ISO-646. These 7-bit codes were commonly called "ASCII" 
> by their users, despite the differences, e.g. replacing the dollar sign $ 
> with the international currency sign ¤, or replacing the left brace 
> { with the letter s with caron š.

> One consequence of this is that the MIME type for ASCII text is called 
> "US ASCII", despite the redundancy, because many people expect "ASCII" 
> alone to mean whatever national variation they are used to.

> But it gets worse: there are proprietary variations on ASCII which are 
> commonly called "ASCII" but aren't, including dozens of 8-bit so-called 
> "extended ASCII" character sets, which i

Re: One liners

2013-12-06 Thread Steven D'Aprano
On Fri, 06 Dec 2013 17:20:27 -0700, Michael Torrie wrote:

> On 12/06/2013 05:14 PM, Dan Stromberg wrote:
>> I'm thinking mostly of stackoverflow, but here's an example I ran into
>> (a lot of) on a job:
>> 
>> somevar = some_complicated_thing(somevar) if
>> some_other_complicated_thing(somevar) else somevar
>> 
>> Would it really be so bad to just use an if statement?  Why are we
>> assigning somevar to itself?  This sort of thing was strewn across 3 or
>> 4 physical lines at a time.

Unless you're embedding it in another statement, there's no advantage to 
using the ternary if operator if the clauses are so large you have to 
split the line over two or more lines in the first place. I agree that:

result = (spam(x) + eggs(x) + toast(x) 
  if x and condition(x) or another_condition(x)
  else foo(x) + bar(x) + foobar(x))
 
is probably better written as:

if x and condition(x) or another_condition(x):
    result = spam(x) + eggs(x) + toast(x)
else:
    result = foo(x) + bar(x) + foobar(x)


The ternary if is slightly unusual and unfamiliar, and is best left for 
when you need an expression:

ingredients = [spam, eggs, cheese, toast if flag else bread, tomato]


As for your second complaint, "why are we assigning somevar to itself", I 
see nothing wrong with that. Better that than a plethora of variables 
used only once:


# Screw this for a game of soldiers.
def function(arg, param_as_list_or_string):
    if isinstance(param_as_list_or_string, str):
        param = param_as_list_or_string.split()
    else:
        param = param_as_list_or_string


# Better.
def function(arg, param):
    if isinstance(param, str):
        param = param.split()


"Replace x with a transformed version of x" is a perfectly legitimate 
technique, and not one which ought to be too hard to follow.


> You're right that a conventional "if" block is not only more readable,
> but also faster and more efficient code.

Really? I don't think so. This is using Python 2.7:


[steve@ando ~]$ python -m timeit --setup="flag = 0" \
> "if flag: y=1
> else: y=2"
1000 loops, best of 3: 0.0836 usec per loop

[steve@ando ~]$ python -m timeit --setup="flag = 0" "y = 1 if flag else 2"
1000 loops, best of 3: 0.0813 usec per loop


There's practically nothing between the two, but the ternary if operator 
is marginally faster.
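
The same comparison can be run from a script with the timeit module. This
is a sketch for Python 3; the loop and repeat counts are arbitrary, and the
numbers will differ from machine to machine.

```python
import timeit

NUMBER = 100_000

stmt_block = "if flag: y = 1\nelse: y = 2"
stmt_expr = "y = 1 if flag else 2"

# Take the best of three runs for each form, as the command line tool does.
t_block = min(timeit.repeat(stmt_block, setup="flag = 0", repeat=3, number=NUMBER))
t_expr = min(timeit.repeat(stmt_expr, setup="flag = 0", repeat=3, number=NUMBER))

print(f"if statement: {t_block / NUMBER * 1e6:.4f} usec per loop")
print(f"ternary:      {t_expr / NUMBER * 1e6:.4f} usec per loop")
```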

As for readability, I accept that ternary if is unusual compared to other 
languages, but it's still quite readable in small doses. If you start 
chaining them:

result = a if condition else b if flag else c if predicate else d 

you probably shouldn't.


-- 
Steven


Re: Managing Google Groups headaches

2013-12-06 Thread Ned Batchelder

On 12/6/13 8:03 AM, rusi wrote:
>> I think you're off on the wrong track here.  This has nothing to do with
>> plain text (ascii or otherwise).  It has to do with divorcing how you
>> store and transport messages (be they plain text, HTML, or whatever)
>> from how a user interacts with them.
>
> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissible assumptions:
>
> "unicode as a medium is universal in the same way that ASCII used to be"
>
> I wrote a number of ellipsis characters ie codepoint 2026 as in:
>
>    - human communication…
> (is not very different from)
>    - machine communication…
>
> Somewhere between my sending and your quoting those ellipses became
> the replacement character FFFD
>
> > >   - human communication�
> > >(is not very different from)
> > >   - machine communication�
>
> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII"
> were to uniformly hold.
>
> Of course with unicode also this can be made to not happen, but that
> is fragile and error-prone.  And that is because ASCII (not extended)
> is ONE thing in a way that unicode is hopelessly a motley inconsistent
> variety.


You seem to be suggesting that we should stick to ASCII.  There are of 
course languages that need more than just the Latin alphabet.  How would 
you suggest we support them?  Or maybe I don't understand?


--Ned.



Re: One liners

2013-12-06 Thread Steven D'Aprano
On Fri, 06 Dec 2013 15:54:22 -0800, Dan Stromberg wrote:

> Does anyone else feel like Python is being dragged too far in the
> direction of long, complex, multiline one-liners?  Or avoiding temporary
> variables with descriptive names?  Or using regex's for everything under
> the sun?

All those things are stylistic issues, not language issues. Yes, I see 
far too many people trying to squeeze three lines of code into one, but 
that's their choice, not the language leading them that way.

On the other hand, Python code style is influenced strongly by functional 
languages like Lisp, Scheme and Haskell (despite the radically different 
syntax). Python has even been described approvingly as "Lisp without the 
brackets". To somebody coming from a C or Pascal procedural background, 
or a Java OOP background, such functional-style code might seem too 
concise and/or weird. But frankly, I think that such programmers would 
write better code with a more functional approach. I refuse to apologise 
for writing the one-liner:

result = [func(item) for item in sequence]

instead of four:

result = []
for i in range(len(sequence)):
    item = sequence[i]
    result.append(func(item))


> What happened to using classes?  What happened to the beautiful emphasis
> on readability?  What happened to debuggability (which is always harder
> than writing things in the first place)?  And what happened to string
> methods?

What about string methods?

As far as classes go, I find that they're nearly always overkill. Most of 
the time, a handful of pre-written standard classes, like dict, list, 
namedtuple and the like, get me 90% of the way to where I need to go.
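
For instance, a namedtuple often covers what a small hand-written class
would otherwise be used for. A minimal sketch, with a made-up Point type:

```python
from collections import namedtuple

# Field access by name, tuple behaviour, and a readable repr for free.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=3, y=4)
print(p)          # Point(x=3, y=4)
print(p.x + p.y)  # 7

x, y = p          # unpacks like a plain tuple
moved = p._replace(x=10)
print(moved)      # Point(x=10, y=4)
```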

The beauty of Python is that it is a multi-paradigm language. You can 
write imperative, procedural, functional, OOP, or pipelining style (and 
probably more). The bad thing about Python is that if you're reading 
other people's code you *need* to be familiar with all those styles.


> I'm pleased to see Python getting more popular, but it feels like a lot
> of newcomers are trying their best to turn Python into Perl or
> something, culturally speaking.

They're probably writing code using the idioms they are used to from 
whatever language they have come from. Newcomers nearly always do this. 
The more newcomers you get, the less Pythonic the code you're going to 
see from them.


-- 
Steven


Re: Python 2.8 release schedule

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 1:00 PM, Mark Lawrence  wrote:
> On 07/12/2013 01:54, Chris Angelico wrote:
>>
>> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence 
>> wrote:
>>> Sorry but I don't get it :)
>>
>> [explained the joke]
>
> Clearly that went straight over your head.

*facepalm* Yep, it did. Completely missed what you said there.

Doh. I see what you did there... now.

ChrisA


Re: Python 2.8 release schedule

2013-12-06 Thread Mark Lawrence

On 07/12/2013 01:54, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence  wrote:
>> On 07/12/2013 01:39, Terry Reedy wrote:
>>> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>>>> My apologies if you've seen this before but here is the official
>>>> schedule http://www.python.org/dev/peps/pep-0404/
>>>
>>> The PEP number is not an accident ;-).
>>
>> Sorry but I don't get it :)
>
> HTTP error 404 "Not Found", probably the most famous (though not the
> most common) HTTP return code.
>
> You asked for Python 2.8? Sorry, not found... it's 404.
>
> ChrisA

Clearly that went straight over your head.

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: Python 2.8 release schedule

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 12:48 PM, Mark Lawrence  wrote:
> On 07/12/2013 01:39, Terry Reedy wrote:
>>
>> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>>>
>>> My apologies if you've seen this before but here is the official
>>> schedule http://www.python.org/dev/peps/pep-0404/
>>
>>
>> The PEP number is not an accident ;-).
>
>
> Sorry but I don't get it :)

HTTP error 404 "Not Found", probably the most famous (though not the
most common) HTTP return code.

You asked for Python 2.8? Sorry, not found... it's 404.

ChrisA


Re: Python 2.8 release schedule

2013-12-06 Thread Mark Lawrence

On 07/12/2013 01:39, Terry Reedy wrote:
> On 12/6/2013 4:26 PM, Mark Lawrence wrote:
>> My apologies if you've seen this before but here is the official
>> schedule http://www.python.org/dev/peps/pep-0404/
>
> The PEP number is not an accident ;-).

Sorry but I don't get it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: Python 2.8 release schedule

2013-12-06 Thread Terry Reedy

On 12/6/2013 4:26 PM, Mark Lawrence wrote:
> My apologies if you've seen this before but here is the official
> schedule http://www.python.org/dev/peps/pep-0404/

The PEP number is not an accident ;-).
--
Terry Jan Reedy



Re: interactive help on the base object

2013-12-06 Thread Terry Reedy

On 12/6/2013 12:03 PM, Mark Lawrence wrote:

> Is it just me, or is this basically useless?
>
>  >>> help(object)
> Help on class object in module builtins:
>
> class object
>   |  The most base type


Given that this can be interpreted as 'least desirable', it could 
definitely be improved.



> Surely a few more words,


How about something like.

'''The default top superclass for all Python classes.

Its methods are inherited by all classes unless overridden.
'''

When you have 1 or more concrete suggestions for the docstring, open a 
tracker issue.


> or a pointer to this
> http://docs.python.org/3/library/functions.html#object, would be better?


URLs don't belong in docstrings. People should know how to find things 
in the manual index.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Roy Smith
Joel Goldstick wrote:

> Aside from django urls, I am not sure I ever wrote regexes in python.  For
> some reason they must seem awfully sexy to quite a few people.  Back to my
> point above -- ever try to figure out a complicated regex written by
> someone else?

Regexes have a bad rap in the Python community.  To be sure, you can abuse 
them, and write horrible monstrosities.  On the other hand, stuff like 
this (slightly reformatted for posting):

pattern = re.compile(
r'haproxy\[(?P\d+)]: '
r'(?P(\d{1,3}\.){3}\d{1,3}):'
r'(?P\d{1,5}) '
r'\[(?P\d{2}/\w{3}/\d{4}(:\d{2}){3}\.\d{3})] '
r'(?P\S+) '
r'(?P\S+)/'
r'(?P\S+) '
r'(?P(-1|\d+))/'
r'(?P(-1|\d+))/'
r'(?P(-1|\d+))/'
r'(?P(-1|\d+))/'
r'(?P\+?\d+) '
r'(?P\d{3}) '
r'(?P\d+) '
r'(?P\S+) '
r'(?P\S+) '
r'(?P[\w-]{4}) '
r'(?P\d+)/'
r'(?P\d+)/'
r'(?P\d+)/'
r'(?P\d+)/'
r'(?P\d+) '
r'(?P\d+)/'
r'(?P\d+) '
r'(\{(?P.*?)\} )?'   # Comment out for stock haproxy
r'(\{(?P.*?)\} )?'
r'(\{(?P.*?)\} )?'
r'"(?P.+)"'
)

while intimidating at first glance, really isn't that hard to 
understand.  Python's raw string literals, adjacent string literal 
catenation, and automatic line continuation team up to eliminate a lot 
of extra fluff.
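
A smaller sketch of the same technique, in case it helps (Python 3). The log line and the group names (pid, client_ip, client_port) are made up for illustration; they are assumptions, not the names used in the pattern above.

```python
import re

# Adjacent raw string literals are joined at compile time, so each
# line of the pattern can describe one field of the (hypothetical) log line.
pattern = re.compile(
    r'haproxy\[(?P<pid>\d+)]: '
    r'(?P<client_ip>(\d{1,3}\.){3}\d{1,3}):'
    r'(?P<client_port>\d{1,5}) '
)

line = 'haproxy[1234]: 10.0.0.7:51234 rest of the line'
match = pattern.search(line)

# groupdict() returns only the named groups, keyed by name.
fields = match.groupdict()
print(fields['pid'], fields['client_ip'], fields['client_port'])
```

Naming each group means the code that consumes the match reads fields by name rather than by position, which is most of what keeps a long pattern like this maintainable.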
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Gregory,

On 07/12/13 08:39, Gregory Ewing wrote:
> Garthy wrote:
>> To allow each script to run in its own environment, with minimal
>> chance of inadvertent interaction between the environments, whilst
>> allowing each script the ability to stall on conditions that will be
>> later met by another thread supplying the information, and to fit in
>> with existing infrastructure.
>
> The last time I remember this being discussed was in the context
> of allowing free threading. Multiple interpreters don't solve
> that problem, because there's still only one GIL and some
> objects are shared.

I am fortunate in my case as the normal impact of the GIL would be much 
reduced. The common case is only one script actively progressing at a 
time- with the others either not running or waiting for external input 
to continue.


But as you point out in your other reply, there are still potential 
concerns that arise from the smaller set of shared objects even across 
interpreters.


> But if all you want is for each plugin to have its own version
> of sys.modules, etc., and you're not concerned about malicious
> code, then it may be good enough.

I wouldn't say that I wasn't concerned about it entirely, but on the 
other hand it is not a hard requirement to which all other concerns are 
secondary.


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Ethan Furman

On 12/06/2013 03:38 PM, Joel Goldstick wrote:

On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot wrote:


def MyFunc(self, originalData):
    data = {}
    dateStrs = []
    for i in xrange(0, len(originalData)):
        dateStr, freq, source = originalData[i]
        data[str(dateStr)] = {source: freq}

# above line confuses me!

        dateStrs.append(dateStr)
    for i in xrange(0, len(dateStrs) - 1):
        currDateStr = str(dateStrs[i])
        nextDateStrs = str(dateStrs[i + 1])


Python lets you iterate over a list directly, so :

    for d in originalData:
        dateStr, freq, source = d
        data[source] = freq


You could shorten that to

   for dateStr, freq, source in originalData:

and if dateStr is already a string:

   data[dateStr] = {source: freq}


Your code looks like you come from a C background.  Python idioms are different.


Agreed.



I'm not sure what you are trying to do in the second for loop, but I think you 
are trying to iterate thru a dictionary
in a certain order, and you can't depend on the order


The second loop is iterating over the list dateStrs.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Rotwang

On 06/12/2013 16:51, Piotr Dobrogost wrote:

[...]

>> I thought of that argument later the next day. Your proposal does
>> unify access if the old obj.x syntax is removed.
>
> As long as obj.x is a very concise way to get attribute named 'x' from
> object obj it's somehow odd that identifier x is treated not like
> identifier but like string literal 'x'. If it were treated like an
> identifier then we would get attribute with name being value of x
> instead attribute named 'x'. Making it possible to use string literals
> in the form obj.'x' as proposed this would make getattr basically
> needless as long as we use only variable not expression to denote
> attribute's name.


But then every time you wanted to get an attribute with a name known at 
compile time you'd need to write obj.'x' instead of obj.x, thereby 
requiring two additional keystrokes. Given that the large majority of 
attribute access Python code uses dot syntax rather than getattr, this 
seems like it would massively outweigh the eleven keystrokes one saves 
by writing obj.'x' instead of getattr(obj,'x').
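
For reference, a minimal sketch of what is available today without any new syntax. The class name Namespace and the attribute names are invented for the example.

```python
class Namespace(object):
    """Empty container used only for this illustration."""

obj = Namespace()
obj.x = 1

# Name known when the code is written: dot syntax.
assert obj.x == 1

# Name only known at run time: getattr with a variable.
name = 'x'
assert getattr(obj, name) == 1

# Name that is not a valid identifier: only setattr/getattr can reach it.
setattr(obj, 'not-an-identifier', 2)
odd_value = getattr(obj, 'not-an-identifier')

# The optional third argument is returned when the attribute is missing.
default_value = getattr(obj, 'missing', None)
```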


--
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Joel Goldstick
On Fri, Dec 6, 2013 at 7:20 PM, Michael Torrie  wrote:

> On 12/06/2013 05:14 PM, Dan Stromberg wrote:
> > I'm thinking mostly of stackoverflow, but here's an example I ran into (a
> > lot of) on a job:
> >
> > somevar = some_complicated_thing(somevar) if
> > some_other_complicated_thing(somevar) else somevar
> >
> > Would it really be so bad to just use an if statement?  Why are we
> > assigning somevar to itself?  This sort of thing was strewn across 3 or 4
> > physical lines at a time.
>
> You're right that a conventional "if" block is not only more readable,
> but also faster and more efficient code.  Sorry you have to deal with
> code written like that!  That'd frustrate any sane programmer.  It might
> bother me enough to write code to reformat the program to convert that
> style to something sane!  There are times when the ternary (did I get
> that right?) operator is useful and clear.

While it seems to carry higher status on a team to write new code than to
fix old code, so much can be learned by having to plough through old code:
you learn others' coding styles, pick up new understanding, and most
importantly totally disabuse yourself of trying to be cute with code.
Code is read by the machine and by the programmer.  The programmer is the
one who should be deferred to, imo.  You buy the machine, you rent the
programmer by the hour!

Aside from django urls, I am not sure I ever wrote regexes in python.  For
some reason they must seem awfully sexy to quite a few people.  Back to my
point above -- ever try to figure out a complicated regex written by
someone else?



-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Tim Chase
On 2013-12-06 11:37, Igor Korot wrote:
> def MyFunc(self, originalData):
>     data = {}
>     for i in xrange(0, len(originalData)):
>         dateStr, freq, source = originalData[i]
>         data[str(dateStr)] = {source: freq}

this can be more cleanly/pythonically written as

  def my_func(self, original_data):
    data = {}
    for date, freq, source in original_data:
      data[str(date)] = {source: freq}

or even just

data = dict(
  (str(date), {source: freq})
  for date, freq, source in original_data
  )

You're calling it a "dateStr", which suggests that it's already a
string, so I'm not sure why you're str()'ing it.  So I'd either just
call it "date", or skip the str(date) bit if it's already a string.
That said, do you even need to convert it to a string (as
datetime.date objects can be used as keys in dictionaries)?
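
A quick sketch of that last point, assuming the rows carry real datetime.date objects. The sample rows are made up.

```python
import datetime

# Hypothetical rows in the (date, freq, source) shape used in this thread.
original_data = [
    (datetime.date(2013, 12, 5), 3, 'web'),
    (datetime.date(2013, 12, 6), 7, 'mail'),
]

# date objects are hashable, so they work as dictionary keys as-is.
data = dict(
    (date, {source: freq})
    for date, freq, source in original_data
)

# Lookup uses an equal date object; no str() round trip is needed.
lookup = data[datetime.date(2013, 12, 6)]
```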

> for i in xrange(0, len(dateStrs) - 1):
>   currDateStr = str(dateStrs[i])
>   nextDateStrs = str(dateStrs[i + 1])
> 
> It seems very strange that I need the dateStrs list just for the
> purpose of looping thru the dictionary keys.
> Can I get rid of the "dateStrs" variable?

Your code isn't actually using the data-dict at this point.  If you
were doing something with it, it might help to know what you want to
do.

Well, you can iterate over the original data, zipping them together:

  for (cur, _, _), (next, _, _) in zip(
      original_data[:-1],
      original_data[1:]
      ):
    do_something(cur, next)

If your purpose for the "data" dict is to merely look up stats from
the next one, the whole batch of your original code can be replaced
with:

  for (
      (cur_dt, cur_freq, cur_source),
      (next_dt, next_freq, next_source)
      ) in zip(original_data[:-1], original_data[1:]):
    # might need to do str(cur_dt) and str(next_dt) instead?
    do_things_with(cur_dt, cur_freq, cur_source,
                   next_dt, next_freq, next_source)

That eliminates the dict *and* the extra variable name. :-)
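
A runnable sketch of the pairing idea, written for Python 3. The sample rows and the helper name consecutive_pairs are invented for the example.

```python
# Hypothetical rows in the (date, freq, source) shape used above.
original_data = [
    ('2013-12-04', 3, 'web'),
    ('2013-12-05', 7, 'mail'),
    ('2013-12-06', 5, 'web'),
]

def consecutive_pairs(rows):
    """Return [(row0, row1), (row1, row2), ...] for a list of rows."""
    # Zipping the list against itself shifted by one pairs each row
    # with its successor; the last row has no successor and is dropped.
    return list(zip(rows[:-1], rows[1:]))

pairs = consecutive_pairs(original_data)
for (cur_dt, cur_freq, _), (next_dt, next_freq, _) in pairs:
    print(cur_dt, '->', next_dt, 'freq change:', next_freq - cur_freq)
```

An empty or one-element list simply produces no pairs, so no special case is needed for short input.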

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Joel Goldstick
On Fri, Dec 6, 2013 at 7:16 PM, Roy Smith  wrote:

> Joel Goldstick wrote:
>
> > Python lets you iterate over a list directly, so :
> >
> >     for d in originalData:
> >         dateStr, freq, source = d
> >         data[source] = freq
>
> I would make it even simpler:
>
> >     for dateStr, freq, source in originalData:
> >         data[source] = freq
>


+1 --- I agree

To the OP:

Could you add a docstring to your function to explain what is supposed to
happen, describe the input and output?  If you do that I'm sure you could
get some more complete help with your code.




-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Gregory,

On 07/12/13 08:53, Gregory Ewing wrote:
> Garthy wrote:
>> The bare minimum would be protection against inadvertent interaction.
>> Better yet would be a setup that made such interaction annoyingly
>> difficult, and the ideal would be where it was impossible to interfere.
>
> To give you an idea of the kind of interference that's
> possible, consider:
>
> 1) You can find all the subclasses of a given class
> object using its __subclasses__() method.
>
> 2) Every class ultimately derives from class object.
>
> 3) All built-in class objects are shared between
> interpreters.
>
> So, starting from object.__subclasses__(), code in any
> interpreter could find any class defined by any other
> interpreter and mutate it.

Many thanks for the excellent example. It was not clear to me how 
readily such a small and critical bit of shared state could potentially 
be abused across interpreter boundaries. I am guessing this would be the 
first in a chain of potential problems I may run into.


> This is not something that is likely to happen by
> accident. Whether it's "annoyingly difficult" enough
> is something you'll have to decide.

I think it'd fall under "protection against inadvertent modification"- 
down the scale somewhat. It doesn't sound like it would be too difficult 
to achieve if the author was so inclined.


> Also keep in mind that it's fairly easy for Python
> code to chew up large amounts of memory and/or CPU
> time in an uninterruptible way, e.g. by
> evaluating 5**1. So even a thread that's
> keeping its hands entirely to itself can still
> cause trouble.

Thanks for the tip. The potential for deliberate resource exhaustion is 
unfortunately something that I am likely going to have to put up with in 
order to keep things in the same process.


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Michael Torrie
On 12/06/2013 05:14 PM, Dan Stromberg wrote:
> I'm thinking mostly of stackoverflow, but here's an example I ran into (a
> lot of) on a job:
> 
> somevar = some_complicated_thing(somevar) if
> some_other_complicated_thing(somevar) else somevar
> 
> Would it really be so bad to just use an if statement?  Why are we
> assigning somevar to itself?  This sort of thing was strewn across 3 or 4
> physical lines at a time.

You're right that a conventional "if" block is not only more readable,
but also faster and more efficient code.  Sorry you have to deal with
code written like that!  That'd frustrate any sane programmer.  It might
bother me enough to write code to reformat the program to convert that
style to something sane!  There are times when the ternary (did I get
that right?) operator is useful and clear.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Roy Smith
Joel Goldstick wrote:

> Python lets you iterate over a list directly, so :
> 
>     for d in originalData:
>         dateStr, freq, source = d
>         data[source] = freq

I would make it even simpler:

>     for dateStr, freq, source in originalData:
>         data[source] = freq
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Dan Stromberg
On Fri, Dec 6, 2013 at 4:10 PM, Michael Torrie  wrote:

> On 12/06/2013 04:54 PM, Dan Stromberg wrote:
> > Does anyone else feel like Python is being dragged too far in the direction
> > of long, complex, multiline one-liners?  Or avoiding temporary variables
> > with descriptive names?  Or using regex's for everything under the sun?
> >
> > What happened to using classes?  What happened to the beautiful emphasis on
> > readability?  What happened to debuggability (which is always harder than
> > writing things in the first place)?  And what happened to string methods?
> >
> > I'm pleased to see Python getting more popular, but it feels like a lot of
> > newcomers are trying their best to turn Python into Perl or something,
> > culturally speaking.
>
> I have not seen any evidence that this trend of yours is widespread.
> The Python code I come across seems pretty normal to me.  Expressive and
> readable.  Haven't seen any attempt to turn Python into Perl or that
> sort of thing.  And I don't see that culture expressed on the list.
> Maybe I'm just blind...


I'm thinking mostly of stackoverflow, but here's an example I ran into (a
lot of) on a job:

somevar = some_complicated_thing(somevar) if
some_other_complicated_thing(somevar) else somevar

Would it really be so bad to just use an if statement?  Why are we
assigning somevar to itself?  This sort of thing was strewn across 3 or 4
physical lines at a time.
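
To make the comparison concrete, here is a small sketch of both spellings. The two helper functions are stand-ins invented for the example; only their names come from the snippet above.

```python
def some_other_complicated_thing(value):
    """Stand-in predicate, invented for this example."""
    return value < 0

def some_complicated_thing(value):
    """Stand-in transformation, invented for this example."""
    return -value

def one_liner(somevar):
    # The conditional-expression form from the post.
    somevar = some_complicated_thing(somevar) if some_other_complicated_thing(somevar) else somevar
    return somevar

def with_if_statement(somevar):
    # Same behaviour, without assigning somevar to itself.
    if some_other_complicated_thing(somevar):
        somevar = some_complicated_thing(somevar)
    return somevar
```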
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Michael Torrie
On 12/06/2013 04:54 PM, Dan Stromberg wrote:
> Does anyone else feel like Python is being dragged too far in the direction
> of long, complex, multiline one-liners?  Or avoiding temporary variables
> with descriptive names?  Or using regex's for everything under the sun?
> 
> What happened to using classes?  What happened to the beautiful emphasis on
> readability?  What happened to debuggability (which is always harder than
> writing things in the first place)?  And what happened to string methods?
> 
> I'm pleased to see Python getting more popular, but it feels like a lot of
> newcomers are trying their best to turn Python into Perl or something,
> culturally speaking.

I have not seen any evidence that this trend of yours is widespread.
The Python code I come across seems pretty normal to me.  Expressive and
readable.  Haven't seen any attempt to turn Python into Perl or that
sort of thing.  And I don't see that culture expressed on the list.
Maybe I'm just blind...


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: One liners

2013-12-06 Thread Ned Batchelder

On 12/6/13 6:54 PM, Dan Stromberg wrote:


> Does anyone else feel like Python is being dragged too far in the
> direction of long, complex, multiline one-liners?  Or avoiding temporary
> variables with descriptive names?  Or using regex's for everything under
> the sun?
>
> What happened to using classes?  What happened to the beautiful emphasis
> on readability?  What happened to debuggability (which is always harder
> than writing things in the first place)?  And what happened to string
> methods?
>
> I'm pleased to see Python getting more popular, but it feels like a lot
> of newcomers are trying their best to turn Python into Perl or
> something, culturally speaking.


I agree with you that those trends would be bad.  But I'm not sure how 
you are judging that "Python" is being dragged in that direction?  It's 
a huge community.  Sure some people are obsessed with fewer lines, and 
micro-optimizations, and other newb mistakes, but there are good people too!


--Ned, ever the optimist.


--
https://mail.python.org/mailman/listinfo/python-list


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Gregory Ewing

rusi wrote:

On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote:


You're exaggerating. It's more like 500 years ago. :-)


I was going to say the same until I noticed the "the way people think English
was spoken..."

That makes it unarguable -- surely there are some people who (wrongly) think so?


Probably. They're surprisingly far off, though. Here's
a sample of actual 1000-year-old English:

http://answers.yahoo.com/question/index?qid=20100314001840AAygUaq

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


One liners

2013-12-06 Thread Dan Stromberg
Does anyone else feel like Python is being dragged too far in the direction
of long, complex, multiline one-liners?  Or avoiding temporary variables
with descriptive names?  Or using regex's for everything under the sun?

What happened to using classes?  What happened to the beautiful emphasis on
readability?  What happened to debuggability (which is always harder than
writing things in the first place)?  And what happened to string methods?

I'm pleased to see Python getting more popular, but it feels like a lot of
newcomers are trying their best to turn Python into Perl or something,
culturally speaking.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 6:00 AM, Steven D'Aprano
 wrote:
> - character 33 was permitted to be either the exclamation
>   mark ! or the logical OR symbol |
>
> - consequently character 124 (vertical bar) was always
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way
>
> - character 35 was permitted to be either the number sign # or
>   the pound sign £
>
> - character 94 could be either a caret ^ or a logical NOT ¬

Yeah, good fun stuff. I first met several of these ambiguities in the
OS/2 REXX documentation, which detailed the language's operators by
specifying their byte values as well as their characters - for
instance, this quote from the docs (yeah, I still have it all here):

"""
Note:   Depending upon your Personal System keyboard and the code page
you are using, you may not have the solid vertical bar to select. For
this reason, REXX also recognizes the use of the split vertical bar as
a logical OR symbol. Some keyboards may have both characters. If so,
they are not interchangeable; only the character that is equal to the
ASCII value of 124 works as the logical OR. This type of mismatch can
also cause the character on your screen to be different from the
character on your keyboard.
"""
(The front material on the docs says "(C) Copyright IBM Corp. 1987,
1994. All Rights Reserved.")

It says "ASCII value" where on this list we would be more likely to
call it "byte value", and I'd prefer to say "represented by" rather
than "equal to", but nonetheless, this is still clearly distinguishing
characters and bytes. The language spec is on characters, but
ultimately the interpreter is going to be looking at bytes, so when
there's a problem, it's byte 124 that's the one defined as logical OR.
Oh, and note the copyright date. The byte/char distinction isn't new.
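
A small Python 3 sketch of that distinction, using the two bar characters from the quote. The choice of encodings (ASCII, Latin-1, UTF-8, code page 437) is mine, picked only to illustrate the point.

```python
solid_bar = '|'        # U+007C VERTICAL LINE
broken_bar = '\u00a6'  # U+00A6 BROKEN BAR

# The solid bar is code point 124 and encodes to the single byte 124 in ASCII.
assert ord(solid_bar) == 124
assert solid_bar.encode('ascii') == b'\x7c'

# The broken bar is a different character; its bytes depend on the encoding.
broken_latin1 = broken_bar.encode('latin-1')   # one byte
broken_utf8 = broken_bar.encode('utf-8')       # two bytes

# Going the other way, byte 124 decodes to the solid bar in code page 437.
decoded = b'\x7c'.decode('cp437')
```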

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Joel Goldstick
On Fri, Dec 6, 2013 at 2:37 PM, Igor Korot  wrote:

> Hi, ALL,
> I have following code:
>
> def MyFunc(self, originalData):
>     data = {}
>     dateStrs = []
>     for i in xrange(0, len(originalData)):
>         dateStr, freq, source = originalData[i]
>         data[str(dateStr)] = {source: freq}
>

# above line confuses me!

>         dateStrs.append(dateStr)
>     for i in xrange(0, len(dateStrs) - 1):
>         currDateStr = str(dateStrs[i])
>         nextDateStrs = str(dateStrs[i + 1])
>
>
Python lets you iterate over a list directly, so :

    for d in originalData:
        dateStr, freq, source = d
        data[source] = freq

Your code looks like you come from a C background.  Python idioms are
different.

I'm not sure what you are trying to do in the second for loop, but I think
you are trying to iterate thru a dictionary in a certain order, and you
can't depend on the order

>
> It seems very strange that I need the dateStrs list just for the
> purpose of looping thru the dictionary keys.
> Can I get rid of the "dateStrs" variable?
>
> Thank you.



-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Eliminate "extra" variable

2013-12-06 Thread Gary Herron

On 12/06/2013 11:37 AM, Igor Korot wrote:

> Hi, ALL,
> I have following code:
>
> def MyFunc(self, originalData):
>     data = {}
>     dateStrs = []
>     for i in xrange(0, len(originalData)):
>         dateStr, freq, source = originalData[i]
>         data[str(dateStr)] = {source: freq}
>         dateStrs.append(dateStr)
>     for i in xrange(0, len(dateStrs) - 1):
>         currDateStr = str(dateStrs[i])
>         nextDateStrs = str(dateStrs[i + 1])
>
> It seems very strange that I need the dateStrs list just for the
> purpose of looping thru the dictionary keys.
> Can I get rid of the "dateStrs" variable?
>
> Thank you.


You want to build a list, but you don't want to give that list a name?  
Why not?  And how would you refer to that list in the second loop if it 
didn't have a name?


And concerning that second loop:  What are you trying to do there? It 
looks like a complete waste of time.  In fact, with what you've shown 
us, you can eliminate the variable dateStrs, and both loops and be no 
worse off.


Perhaps there is more to your code than you've shown to us ...

Gary Herron

--
https://mail.python.org/mailman/listinfo/python-list


Re: Managing Google Groups headaches

2013-12-06 Thread Gregory Ewing

rusi wrote:

> On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote:
>> Which means, if I wanted to (and many examples of this exist), I can
>> write my own client which presents the same information in different
>> ways.
>
> Not sure whats your point.


The point is the existence of an alternative interface that's
designed for use by other programs rather than humans.

This is what web forums are missing. If it existed, one could
easily create an alternative client with a newsreader-like
interface. Without it, such a client would have to be a
monstrosity that worked by screen-scraping the html.

It's not about the format of the messages themselves -- that
could be text, or html, or reST, or bbcode or whatever. It's
about the *framing* of the messages, and being able to
query them by their metadata.
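
As a rough illustration of what querying by metadata looks like when the framing is machine-readable, here is a sketch using the standard library's email package. The two messages and their addresses are made up; nothing here talks to a real server.

```python
from email import message_from_string

# Two made-up messages in RFC 5322 form; the addresses are placeholders.
raw_messages = [
    "From: alice@example.com\nSubject: Re: One liners\n\nBody one\n",
    "From: bob@example.com\nSubject: Eliminate extra variable\n\nBody two\n",
]

messages = [message_from_string(raw) for raw in raw_messages]

# Query by metadata without caring how the body is formatted.
one_liner_posts = [m for m in messages if 'One liners' in m['Subject']]
senders = [m['From'] for m in one_liner_posts]
```

The body could be text, html or anything else; the filter above never looks at it.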

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Eliminate "extra" variable

2013-12-06 Thread Igor Korot
Hi, ALL,
I have following code:

def MyFunc(self, originalData):
    data = {}
    dateStrs = []
    for i in xrange(0, len(originalData)):
        dateStr, freq, source = originalData[i]
        data[str(dateStr)] = {source: freq}
        dateStrs.append(dateStr)
    for i in xrange(0, len(dateStrs) - 1):
        currDateStr = str(dateStrs[i])
        nextDateStrs = str(dateStrs[i + 1])


It seems very strange that I need the dateStrs list just for the
purpose of looping thru the dictionary keys.
Can I get rid of the "dateStrs" variable?

Thank you.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: squeeze out some performance

2013-12-06 Thread Dan Stromberg
On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence wrote:

> On 06/12/2013 16:52, John Ladasky wrote:
>
>> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
>>
>>> I try to squeeze out some performance of the code pasted on the link
>>> below.
>>> http://pastebin.com/gMnqprST
>>>
>>
>> Several comments:
>>
>> 1) I find this program to be very difficult to read, largely because
>> there's a whole LOT of duplicated code.  Look at lines 53-80, and lines
>> 108-287, and lines 294-311.  It makes it harder to see what this algorithm
>> actually does.  Is there a way to refactor some of this code to use some
>> shared function calls?
>>
>>
> A handy tool for detecting duplicated code here
> http://clonedigger.sourceforge.net/ for anyone who's interested.
>

Pylint does this too...
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: squeeze out some performance

2013-12-06 Thread Mark Lawrence

On 06/12/2013 16:52, John Ladasky wrote:

On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:


I try to squeeze out some performance of the code pasted on the link below.
http://pastebin.com/gMnqprST


Several comments:

1) I find this program to be very difficult to read, largely because there's a 
whole LOT of duplicated code.  Look at lines 53-80, and lines 108-287, and 
lines 294-311.  It makes it harder to see what this algorithm actually does.  
Is there a way to refactor some of this code to use some shared function calls?



A handy tool for detecting duplicated code here 
http://clonedigger.sourceforge.net/ for anyone who's interested.


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Gregory Ewing

Garthy wrote:

> The bare minimum would be
> protection against inadvertent interaction. Better yet would be a setup
> that made such interaction annoyingly difficult, and the ideal would be
> where it was impossible to interfere.


To give you an idea of the kind of interference that's
possible, consider:

1) You can find all the subclasses of a given class
object using its __subclasses__() method.

2) Every class ultimately derives from class object.

3) All built-in class objects are shared between
interpreters.

So, starting from object.__subclasses__(), code in any
interpreter could find any class defined by any other
interpreter and mutate it.
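
A sketch of points 1 and 2, run inside a single interpreter. It does not itself demonstrate anything about sub-interpreters; the class PluginThing and the helper find_class are invented for the example.

```python
class PluginThing(object):
    """Stand-in for a class defined by some other piece of code."""
    greeting = 'hello'

def find_class(name, root=object):
    """Walk the subclass tree from root and return the first class with that name."""
    seen = set()
    stack = [root]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        if cls.__name__ == name:
            return cls
        # type.__subclasses__(cls) is the same call as cls.__subclasses__(),
        # but it also works when cls is itself a metaclass such as type.
        stack.extend(type.__subclasses__(cls))
    return None

# No import and no reference to PluginThing is needed to reach it.
found = find_class('PluginThing')
found.greeting = 'mutated'   # the change is visible to everyone holding the class
```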

This is not something that is likely to happen by
accident. Whether it's "annoyingly difficult" enough
is something you'll have to decide.

Also keep in mind that it's fairly easy for Python
code to chew up large amounts of memory and/or CPU
time in an uninterruptible way, e.g. by
evaluating 5**1. So even a thread that's
keeping its hands entirely to itself can still
cause trouble.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: squeeze out some performance

2013-12-06 Thread Joel Goldstick
On Fri, Dec 6, 2013 at 11:52 AM, John Ladasky wrote:

> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
>
> > I try to squeeze out some performance of the code pasted on the link
> > below.
> > http://pastebin.com/gMnqprST
>

Not that this will speed up your code but you have this:

if not clockwise:
    s = start
    start = end
    end = s

Python people would write:
end, start = start, end


You have quite a few if statements that involve multiple comparisons of the
same variable.  Did you know you can do things like this in python:

>>> x = 4
>>> 2 < x < 7
True
>>> x = 55
>>> 2 < x < 7
False
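
Both idioms in one runnable sketch. The starting values and the helper names are made up; only start, end and clockwise come from the pasted code.

```python
start, end = 10, 350
clockwise = False

if not clockwise:
    # Tuple assignment swaps without a temporary variable.
    start, end = end, start

def in_range_chained(x):
    return 2 < x < 7

def in_range_explicit(x):
    # What the chained form means; the chained form evaluates x only once.
    return 2 < x and x < 7
```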


> Several comments:
>
> 1) I find this program to be very difficult to read, largely because
> there's a whole LOT of duplicated code.  Look at lines 53-80, and lines
> 108-287, and lines 294-311.  It makes it harder to see what this algorithm
> actually does.  Is there a way to refactor some of this code to use some
> shared function calls?
>
> 2) I looked up the "Bresenham algorithm", and found two references which
> may be relevant.  The original algorithm was one which computed good raster
> approximations to straight lines.  The second algorithm described may be
> more pertinent to you, because it draws arcs of circles.
>
> http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
> http://en.wikipedia.org/wiki/Midpoint_circle_algorithm
>
> Both of these algorithms are old, from the 1960's, and can be implemented
> using very simple CPU register operations and minimal memory.  Both of the
> web pages I referenced have extensive example code and pseudocode, and
> discuss optimization.  If you need speed, is this really a job for Python?
>
> 3) I THINK that I see some code -- those duplicated parts -- which might
> benefit from the use of multiprocessing (assuming that you have a
> multi-core CPU).  But I would have to read more deeply to be sure.  I need
> to understand the algorithm more completely, and exactly how you have
> modified it for your needs.



-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Gregory Ewing

Garthy wrote:
> To allow each script to run in its own environment, with minimal chance
> of inadvertent interaction between the environments, whilst allowing
> each script the ability to stall on conditions that will be later met by
> another thread supplying the information, and to fit in with existing
> infrastructure.


The last time I remember this being discussed was in the context
of allowing free threading. Multiple interpreters don't solve
that problem, because there's still only one GIL and some
objects are shared.

But if all you want is for each plugin to have its own version
of sys.modules, etc., and you're not concerned about malicious
code, then it may be good enough.

It seems to be good enough for mod_wsgi, because presumably
all the people with the ability to install code on a given
web server trust each other.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Python 2.8 release schedule

2013-12-06 Thread Mark Lawrence
My apologies if you've seen this before but here is the official 
schedule http://www.python.org/dev/peps/pep-0404/


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Roy Smith
Steven D'Aprano writes:

> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about 
> encodings and character sets. It doesn't just assume things are ASCII, 
> but makes a half-hearted attempt to be charset-aware, but badly. I can 
> only imagine that it was written back in the Dark Ages

Indeed.  The basic codebase probably goes back 20 years.  I'm posting this
from gmane, just so people don't think I'm a total luddite.

> When transmitting ASCII characters, the networking protocol could include 
> various start and stop bits and parity codes. A single 7-bit ASCII 
> character might be anything up to 12 bits in length on the wire.

Not to mention that some really old hardware used 1.5 stop bits!




Re: Does Python optimize low-power functions?

2013-12-06 Thread Oscar Benjamin
On 6 December 2013 18:16, John Ladasky wrote:
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways.  The first choice can accommodate 
> non-integer powers and so it would logically proceed by taking a logarithm, 
> multiplying by the power (in this case, 2), and then taking the 
> anti-logarithm.  But for a trivial value for the power like 2, this is 
> clearly a wasteful choice.  Just multiply x by itself, and skip the expensive 
> log and anti-log steps.
>
> My question is, what do Python interpreters do with power operators where the 
> power is a small constant, like 2?  Do they know to take the shortcut?

As mentioned this will depend on the interpreter and on the type of x.
Python's integer arithmetic is exact and unbounded so switching to
floating point and using approximate logarithms is a no go if x is an
int object.
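
A small illustration of why the logarithm route is ruled out for ints. This is only a sketch: the value of x is arbitrary, chosen to be larger than a float can represent exactly.

```python
# Exact integer squaring versus a detour through floating point.
x = 10**20 + 1

exact_power = x ** 2             # arbitrary-precision integer arithmetic
exact_product = x * x            # same value, computed by multiplication
via_float = int(float(x) ** 2)   # x is rounded to a 53-bit float first

print(exact_power == exact_product)   # both integer routes agree
print(exact_power - via_float)        # non-zero: the float detour lost digits
```

The two integer results are identical, while the float detour cannot reproduce the low-order digits.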

For CPython specifically, you can see here:
http://hg.python.org/cpython/file/07ef52e751f3/Objects/floatobject.c#l741
that for floats x**2 will be equivalent to x**2.0 and will be handled
by the pow function from the underlying C math library. If you read
the comments around that line you'll see that different inconsistent
math libraries can do things very differently leading to all kinds of
different problems.

For CPython if x is an int (long) then as mentioned before it is
handled by the HAC algorithm:
http://hg.python.org/cpython/file/07ef52e751f3/Objects/longobject.c#l3934

For CPython if x is a complex then it is handled roughly as you say:
for x**n if n is between -100 and 100 then multiplication is performed
using the "bit-mask exponentiation" algorithm. Otherwise it is
computed by converting to polar exponential form and using logs (see
also the two functions above this one):
http://hg.python.org/cpython/file/07ef52e751f3/Objects/complexobject.c#l151


Oscar


Re: Does Python optimize low-power functions?

2013-12-06 Thread John Ladasky
On Friday, December 6, 2013 11:32:00 AM UTC-8, Nick Cash wrote:

> The reasons why have already been answered, I just wanted to point out that 
> Python makes it extremely easy to check these sorts of things for yourself.

Thanks for the heads-up on the dis module, Nick.  I haven't played with that 
one yet.


Re: ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Gene Heskett
On Friday 06 December 2013 14:30:06 Steven D'Aprano did opine:

> On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:
> > Evidently (and completely inadvertently) this exchange has just
> > illustrated one of the inadmissable assumptions:
> > 
> > "unicode as a medium is universal in the same way that ASCII used to
> > be"
> 
> Ironically, your post was not Unicode.
> 
> Seriously. I am 100% serious.
> 
> Your post was sent using a legacy encoding, Windows-1252, also known as
> CP-1252, which is most certainly *not* Unicode. Whatever software you
> used to send the message correctly flagged it with a charset header:
> 
> Content-Type: text/plain; charset=windows-1252
> 
> Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle
> encodings correctly (or at all!), it screws up the encoding then sends a
> reply with no charset line at all. This is one bug that cannot be blamed
> on Google Groups -- or on Unicode.
> 
> > I wrote a number of ellipsis characters ie codepoint 2026 as in:
> Actually you didn't. You wrote a number of ellipsis characters, hex byte
> \x85 (decimal 133), in the CP1252 charset. That happens to be mapped to
> code point U+2026 in Unicode, but the two are as distinct as ASCII and
> EBCDIC.
> 
> > Somewhere between my sending and your quoting those ellipses became
> > the replacement character FFFD
> 
> Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about
> encodings and character sets. It doesn't just assume things are ASCII,
> but makes a half-hearted attempt to be charset-aware, but badly. I can
> only imagine that it was written back in the Dark Ages where there were
> a lot of different charsets in use but no conventions for specifying
> which charset was in use. Or perhaps the author was smoking crack while
> coding.
> 
> > Leaving aside whose fault this is (very likely buggy google groups),
> > this mojibaking cannot happen if the assumption "All text is ASCII"
> > were to uniformly hold.
> 
> This is incorrect. People forget that ASCII has evolved since the first
> version of the standard in 1963. There have actually been five versions
> of the ASCII standard, plus one unpublished version. (And that's not
> including the things which are frequently called ASCII but aren't.)
> 
> ASCII-1963 didn't even include lowercase letters. It is also missing
> some graphic characters like braces, and included at least two
> characters no longer used, the up-arrow and left-arrow. The control
> characters were also significantly different from today.
> 
> ASCII-1965 was unpublished and unused. I don't know the details of what
> it changed.
> 
> ASCII-1967 is a lot closer to the ASCII in use today. It made
> considerable changes to the control characters, moving, adding,
> removing, or renaming at least half a dozen control characters. It
> officially added lowercase letters, braces, and some others. It
> replaced the up-arrow character with the caret and the left-arrow with
> the underscore. It was ambiguous, allowing variations and
> substitutions, e.g.:
> 
> - character 33 was permitted to be either the exclamation
>   mark ! or the logical OR symbol |
> 
> - consequently character 124 (vertical bar) was always
>   displayed as a broken bar ¦, which explains why even today
>   many keyboards show it that way
> 
> - character 35 was permitted to be either the number sign # or
>   the pound sign £
> 
> - character 94 could be either a caret ^ or a logical NOT ¬
> 
> Even the humble comma could be pressed into service as a cedilla.
> 
> ASCII-1968 didn't change any characters, but allowed the use of LF on
> its own. Previously, you had to use either LF/CR or CR/LF as newline.
> 
> ASCII-1977 removed the ambiguities from the 1967 standard.
> 
> The most recent version is ASCII-1986 (also known as ANSI X3.4-1986).
> Unfortunately I haven't been able to find out what changes were made --
> I presume they were minor, and didn't affect the character set.
> 
> So as you can see, even with actual ASCII, you can have mojibake. It's
> just not normally called that. But if you are given an arbitrary ASCII
> file of unknown age, containing code 94, how can you be sure it was
> intended as a caret rather than a logical NOT symbol? You can't.
> 
> Then there are at least 30 official variations of ASCII, strictly
> speaking part of ISO-646. These 7-bit codes were commonly called "ASCII"
> by their users, despite the differences, e.g. replacing the dollar sign
> $ with the international currency sign ¤, or replacing the left brace
> { with the letter s with caron š.
> 
> One consequence of this is that the MIME type for ASCII text is called
> "US ASCII", despite the redundancy, because many people expect "ASCII"
> alone to mean whatever national variation they are used to.
> 
> But it gets worse: there are proprietary variations on ASCII which are
> commonly called "ASCII" but aren't, including dozens of 8-bit so-called
> "extended ASCII" character 

RE: Does Python optimize low-power functions?

2013-12-06 Thread Nick Cash
>My question is, what do Python interpreters do with power operators where the 
>power is a small constant, like 2?  Do they know to take the shortcut?

Nope:

Python 3.3.0 (default, Sep 25 2013, 19:28:08) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis(lambda x: x*x)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE
>>> dis.dis(lambda x: x**2)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (2)
              6 BINARY_POWER
              7 RETURN_VALUE


The reasons why have already been answered, I just wanted to point out that 
Python makes it extremely easy to check these sorts of things for yourself.
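
The same check can be done programmatically. The following is a sketch only, assuming a Python 3 interpreter that provides dis.get_instructions; the helper name opnames is made up for this example, and the exact opcode names vary between CPython versions.

```python
import dis

def opnames(function):
    """Return the opcode names of a function, in order."""
    return [instruction.opname for instruction in dis.get_instructions(function)]

square_by_multiply = opnames(lambda x: x * x)
square_by_power = opnames(lambda x: x ** 2)

# The two listings differ: the power form still loads the constant 2
# and applies a power operation instead of a multiplication.
print(square_by_multiply)
print(square_by_power)
```

If the compiler rewrote x**2 into x*x, the two lists would be identical.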


Re: Does Python optimize low-power functions?

2013-12-06 Thread Robert Kern

On 2013-12-06 19:01, Neil Cerutti wrote:
> On 2013-12-06, John Ladasky wrote:
> > The following two functions return the same result:
> >
> >     x**2
> >     x*x
> >
> > But they may be computed in different ways.  The first choice
> > can accommodate non-integer powers and so it would logically
> > proceed by taking a logarithm, multiplying by the power (in
> > this case, 2), and then taking the anti-logarithm.  But for a
> > trivial value for the power like 2, this is clearly a wasteful
> > choice.  Just multiply x by itself, and skip the expensive log
> > and anti-log steps.
> >
> > My question is, what do Python interpreters do with power
> > operators where the power is a small constant, like 2?  Do they
> > know to take the shortcut?
>
> It uses a couple of fast algorithms for computing powers. Here's
> the excerpt with the comments identifying the algorithms used.
> From longobject.c:
>
>     if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
>         /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
>         /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
>     ...
>     else {
>         /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */


It's worth noting that the *interpreter* per se is not doing this. The 
implementation of the `long` object does this in its implementation of the 
`__pow__` method, which the interpreter invokes. Other objects may implement 
this differently and use whatever optimizations they like. They may even (ab)use 
the syntax for things other than numerical exponentiation where `x**2` is not 
equivalent to `x*x`. Since objects are free to do so, the interpreter itself 
cannot choose to optimize that exponentiation down to multiplication.
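
A toy illustration of that last point. The Pattern class below is invented for this example and comes from no library; it gives * and ** unrelated meanings, so x**2 and x*x are not interchangeable.

```python
class Pattern:
    """Hypothetical class: * concatenates, ** wraps in a repeat count."""

    def __init__(self, text):
        self.text = text

    def __mul__(self, other):
        # "Multiplication" joins the two patterns end to end.
        return Pattern(self.text + other.text)

    def __pow__(self, count):
        # "Exponentiation" records a repetition instead of expanding it.
        return Pattern("(%s){%d}" % (self.text, count))

x = Pattern("ab")
print((x * x).text)    # built by __mul__
print((x ** 2).text)   # built by __pow__
```

An interpreter that rewrote x**2 as x*x would silently change what this program does.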


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



ASCII and Unicode [was Re: Managing Google Groups headaches]

2013-12-06 Thread Steven D'Aprano
On Fri, 06 Dec 2013 05:03:57 -0800, rusi wrote:

> Evidently (and completely inadvertently) this exchange has just
> illustrated one of the inadmissable assumptions:
> 
> "unicode as a medium is universal in the same way that ASCII used to be"

Ironically, your post was not Unicode.

Seriously. I am 100% serious.

Your post was sent using a legacy encoding, Windows-1252, also known as 
CP-1252, which is most certainly *not* Unicode. Whatever software you 
used to send the message correctly flagged it with a charset header:

Content-Type: text/plain; charset=windows-1252

Alas, the software Roy Smith uses, MT-NewsWatcher, does not handle 
encodings correctly (or at all!), it screws up the encoding then sends a 
reply with no charset line at all. This is one bug that cannot be blamed 
on Google Groups -- or on Unicode.


> I wrote a number of ellipsis characters ie codepoint 2026 as in:

Actually you didn't. You wrote a number of ellipsis characters, hex byte 
\x85 (decimal 133), in the CP1252 charset. That happens to be mapped to 
code point U+2026 in Unicode, but the two are as distinct as ASCII and 
EBCDIC.
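
The distinction can be seen from Python 3 with the standard codecs. This is a sketch; the variable names are only for illustration.

```python
raw = b"\x85"  # the single byte that was actually sent

as_cp1252 = raw.decode("cp1252")                   # the declared charset
as_latin1 = raw.decode("latin-1")                  # a different 8-bit guess
as_ascii = raw.decode("ascii", errors="replace")   # no charset handling at all

print(hex(ord(as_cp1252)))   # code point of the horizontal ellipsis
print(hex(ord(as_latin1)))   # a C1 control character, not an ellipsis
print(hex(ord(as_ascii)))    # the replacement character

# Mojibake in the other direction: UTF-8 bytes misread as CP1252.
garbled = "\u2026".encode("utf-8").decode("cp1252")
print(len(garbled))          # one character has turned into three
```

The same byte yields three different characters depending on which charset the receiving software assumes.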


> Somewhere between my sending and your quoting those ellipses became the
> replacement character FFFD

Yes, it appears that MT-NewsWatcher is *deeply, deeply* confused about 
encodings and character sets. It doesn't just assume things are ASCII, 
but makes a half-hearted attempt to be charset-aware, but badly. I can 
only imagine that it was written back in the Dark Ages where there were a 
lot of different charsets in use but no conventions for specifying which 
charset was in use. Or perhaps the author was smoking crack while coding.


> Leaving aside whose fault this is (very likely buggy google groups),
> this mojibaking cannot happen if the assumption "All text is ASCII" were
> to uniformly hold.

This is incorrect. People forget that ASCII has evolved since the first 
version of the standard in 1963. There have actually been five versions 
of the ASCII standard, plus one unpublished version. (And that's not 
including the things which are frequently called ASCII but aren't.)

ASCII-1963 didn't even include lowercase letters. It is also missing some 
graphic characters like braces, and included at least two characters no 
longer used, the up-arrow and left-arrow. The control characters were 
also significantly different from today.

ASCII-1965 was unpublished and unused. I don't know the details of what 
it changed.

ASCII-1967 is a lot closer to the ASCII in use today. It made 
considerable changes to the control characters, moving, adding, removing, 
or renaming at least half a dozen control characters. It officially added 
lowercase letters, braces, and some others. It replaced the up-arrow 
character with the caret and the left-arrow with the underscore. It was 
ambiguous, allowing variations and substitutions, e.g.:

- character 33 was permitted to be either the exclamation 
  mark ! or the logical OR symbol |

- consequently character 124 (vertical bar) was always 
  displayed as a broken bar ¦, which explains why even today
  many keyboards show it that way

- character 35 was permitted to be either the number sign # or 
  the pound sign £

- character 94 could be either a caret ^ or a logical NOT ¬

Even the humble comma could be pressed into service as a cedilla.

ASCII-1968 didn't change any characters, but allowed the use of LF on its 
own. Previously, you had to use either LF/CR or CR/LF as newline.

ASCII-1977 removed the ambiguities from the 1967 standard.

The most recent version is ASCII-1986 (also known as ANSI X3.4-1986). 
Unfortunately I haven't been able to find out what changes were made -- I 
presume they were minor, and didn't affect the character set.

So as you can see, even with actual ASCII, you can have mojibake. It's 
just not normally called that. But if you are given an arbitrary ASCII 
file of unknown age, containing code 94, how can you be sure it was 
intended as a caret rather than a logical NOT symbol? You can't.

Then there are at least 30 official variations of ASCII, strictly 
speaking part of ISO-646. These 7-bit codes were commonly called "ASCII" 
by their users, despite the differences, e.g. replacing the dollar sign $ 
with the international currency sign ¤, or replacing the left brace 
{ with the letter s with caron š.

One consequence of this is that the MIME type for ASCII text is called 
"US ASCII", despite the redundancy, because many people expect "ASCII" 
alone to mean whatever national variation they are used to.

But it gets worse: there are proprietary variations on ASCII which are 
commonly called "ASCII" but aren't, including dozens of 8-bit so-called 
"extended ASCII" character sets, which is where the problems *really* 
pile up. Invariably back in the 1980s and early 1990s people used to call 
these "ASCII" no matter that they used 8-bits and contained anything up 
to 256 characters.

Just because somebody 

Re: Does Python optimize low-power functions?

2013-12-06 Thread Neil Cerutti
On 2013-12-06, John Ladasky wrote:
> The following two functions return the same result:
>
> x**2
> x*x
>
> But they may be computed in different ways.  The first choice
> can accommodate non-integer powers and so it would logically
> proceed by taking a logarithm, multiplying by the power (in
> this case, 2), and then taking the anti-logarithm.  But for a
> trivial value for the power like 2, this is clearly a wasteful
> choice.  Just multiply x by itself, and skip the expensive log
> and anti-log steps.
> 
> My question is, what do Python interpreters do with power
> operators where the power is a small constant, like 2?  Do they
> know to take the shortcut?

It uses a couple of fast algorithms for computing powers. Here's
the excerpt with the comments identifying the algorithms used.
From longobject.c:

    if (Py_SIZE(b) <= FIVEARY_CUTOFF) {
        /* Left-to-right binary exponentiation (HAC Algorithm 14.79) */
        /* http://www.cacr.math.uwaterloo.ca/hac/about/chap14.pdf */
    ...
    else {
        /* Left-to-right 5-ary exponentiation (HAC Algorithm 14.82) */

The only outright optimization of the style I think you're
describing that I can see is that it quickly returns zero when the
modulus is one.
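
For readers unfamiliar with the term, left-to-right binary exponentiation can be sketched in pure Python as below. This is an illustration of the general technique under my own naming, not a transcription of the CPython source.

```python
def pow_left_to_right(base, exponent):
    """Compute base**exponent for a non-negative integer exponent."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    # Scan the exponent's bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result *= result          # always square
        if bit == "1":
            result *= base        # multiply in the base when the bit is set
    return result

print(pow_left_to_right(3, 13) == 3 ** 13)
```

For an exponent of 2 this costs one squaring plus trivial multiplications by 1, so there is little left for a special case to save.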

I'm not a skilled or experienced CPython source reader, though.

-- 
Neil Cerutti



Re: Does Python optimize low-power functions?

2013-12-06 Thread Jean-Michel Pichavant
- Original Message -
> The following two functions return the same result:
> 
> x**2
> x*x
> 
> But they may be computed in different ways.  The first choice can
> accommodate non-integer powers and so it would logically proceed by
> taking a logarithm, multiplying by the power (in this case, 2), and
> then taking the anti-logarithm.  But for a trivial value for the
> power like 2, this is clearly a wasteful choice.  Just multiply x by
> itself, and skip the expensive log and anti-log steps.
> 
> My question is, what do Python interpreters do with power operators
> where the power is a small constant, like 2?  Do they know to take
> the shortcut?
> --
> https://mail.python.org/mailman/listinfo/python-list

It is probably specific to the interpreter implementation (CPython, Jython,
IronPython, etc.). You'd better optimize it yourself should you really care about
this.
An alternative is to use numpy functions, like numpy.power, which are optimized
versions of most mathematical functions.
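
Before optimizing by hand, it is worth measuring. The following is a rough sketch using the standard timeit module; the helper names, the setup string and the repeat counts are arbitrary choices for this example, and the timings will differ between machines and interpreters.

```python
import timeit

def time_expression(expression, setup="x = 3.7", number=100000):
    """Return the best of three timings for evaluating the expression."""
    return min(timeit.repeat(expression, setup=setup, repeat=3, number=number))

def compare_square_timings(setup="x = 3.7"):
    """Time three ways of squaring x and return a dict of seconds."""
    return {expr: time_expression(expr, setup)
            for expr in ("x*x", "x**2", "x**2.0")}

for expression, seconds in sorted(compare_square_timings().items()):
    print("%-8s %.4f s" % (expression, seconds))
```

Passing setup="x = 37" repeats the measurement for an int instead of a float.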

JM




Does Python optimize low-power functions?

2013-12-06 Thread John Ladasky
The following two functions return the same result:

x**2
x*x

But they may be computed in different ways.  The first choice can accommodate 
non-integer powers and so it would logically proceed by taking a logarithm, 
multiplying by the power (in this case, 2), and then taking the anti-logarithm. 
 But for a trivial value for the power like 2, this is clearly a wasteful 
choice.  Just multiply x by itself, and skip the expensive log and anti-log 
steps.

My question is, what do Python interpreters do with power operators where the 
power is a small constant, like 2?  Do they know to take the shortcut?


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Christian Gollwitzer

On 06.12.13 14:12, Jean Dubois wrote:
> It works but it's not all clear to me. Can you tell me what
> "label.bind("<1>", quit)" is standing for? What's the <1> meaning?


"bind" connects events sent to the label with a handler. The <1> is the 
event description; in this case, it means a click with the left mouse 
button. The mouse buttons are numbered 1,2,3 for left,middle,right, 
respectively (with right and middle switched on OSX, confusingly). It is 
actually short for





Binding to the key "1" would look like this



The event syntax is rather complex, for example it is possible to add 
modifiers to bind to a Shift-key + right click like this




It is described in detail at the bind man page of Tk.

http://www.tcl.tk/man/tcl8.6/TkCmd/bind.htm

The event object passed to the handler contains additional information, 
for instance the position of the mouse pointer on the screen.
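
A minimal sketch of bindings of the kind described above. It assumes Python 3 (where the module is named tkinter) and the usual Tk pattern spellings from the bind manual page; the names EVENT_PATTERNS, make_handler and bind_all_patterns are invented for this example.

```python
# Event patterns written in the long form documented for Tk's "bind".
EVENT_PATTERNS = {
    "<ButtonPress-1>": "mouse button 1 pressed (the long form of <1>)",
    "<KeyPress-1>": "the '1' key pressed",
    "<Shift-ButtonPress-3>": "Shift held while mouse button 3 is pressed",
}

def make_handler(pattern):
    """Build a handler that reports which pattern fired and where."""
    def handler(event):
        # x_root / y_root give the pointer position on the screen.
        print(pattern, EVENT_PATTERNS[pattern], event.x_root, event.y_root)
    return handler

def bind_all_patterns(widget):
    """Attach one handler per pattern to the given widget."""
    for pattern in EVENT_PATTERNS:
        widget.bind(pattern, make_handler(pattern))

def main():
    import tkinter as tk  # imported here so the helpers work without a display
    root = tk.Tk()
    label = tk.Label(root, text="Click here, or press 1")
    label.pack()
    label.focus_set()     # key events go to the widget with keyboard focus
    bind_all_patterns(label)
    root.mainloop()
```

Calling main() opens a window with a label that prints a line for each matching event.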


In practice, for large parts of the interface you do not mess with the 
keyboard and mouse events directly, but use the corresponding widgets.
In your program, the label works as a simple pushbutton, and therefore a 
button should be used.


#!/usr/bin/env python
import Tkinter as tk
import ttk # for modern widgets
import sys

# no underscore - nothing gets passed
def quit():
    sys.exit()

root = tk.Tk()
button = ttk.Button(root, text="Click mouse here to quit", command=quit)
button.pack()
root.mainloop()


Note that

1) nothing gets passed, so we could have left out changing quit(). This 
is because a button command usually does not care about details of the 
mouse click. It just reacts as the user expects.


2) I use ttk widgets, which provide native look&feel. If possible, use 
those. Good examples on ttk usage are shown at 
http://www.tkdocs.com/tutorial/index.html


HTH,
Christian


interactive help on the base object

2013-12-06 Thread Mark Lawrence

Is it just me, or is this basically useless?

>>> help(object)
Help on class object in module builtins:

class object
 |  The most base type

>>>

Surely a few more words, or a pointer to this 
http://docs.python.org/3/library/functions.html#object, would be better?


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread rusi
On Friday, December 6, 2013 10:11:04 PM UTC+5:30, MRAB wrote:
> On 06/12/2013 15:34, Steven D'Aprano wrote:
> > On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:
> >> yes ,I am a native Chinese speaker.I always post question by Google
> >> Group not through  email ,is there something wrong with it ? your
> >> english is a little strange to me .

> > Mark is writing in fake old-English style, the way people think English
> > was spoken a thousand years ago. I don't know why he did that. Perhaps he
> > thought it was amusing.
> [snip]

> You're exaggerating. It's more like 500 years ago. :-)

I was going to say the same until I noticed the "the way people think English
was spoken..."

That makes it unarguable -- surely there are some people who (wrongly) think so?


Re: squeeze out some performance

2013-12-06 Thread John Ladasky
On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
 
> I try to squeeze out some performance of the code pasted on the link below.
> http://pastebin.com/gMnqprST

Several comments:

1) I find this program to be very difficult to read, largely because there's a 
whole LOT of duplicated code.  Look at lines 53-80, and lines 108-287, and 
lines 294-311.  It makes it harder to see what this algorithm actually does.  
Is there a way to refactor some of this code to use some shared function calls?

2) I looked up the "Bresenham algorithm", and found two references which may be 
relevant.  The original algorithm was one which computed good raster 
approximations to straight lines.  The second algorithm described may be more 
pertinent to you, because it draws arcs of circles.

http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
http://en.wikipedia.org/wiki/Midpoint_circle_algorithm

Both of these algorithms are old, from the 1960's, and can be implemented using 
very simple CPU register operations and minimal memory.  Both of the web pages 
I referenced have extensive example code and pseudocode, and discuss 
optimization.  If you need speed, is this really a job for Python?
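
For reference, the integer-only line algorithm is short enough to sketch in Python. This is the common all-octant formulation with an error accumulator, written for illustration; it is not taken from the pastebin code, and the function name is my own.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the grid points on a raster line from (x0, y0) to (x1, y1)."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy                  # only integer additions and comparisons
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:            # step along x
            error += dy
            x0 += step_x
        if doubled <= dx:            # step along y
            error += dx
            y0 += step_y
    return points

print(bresenham_line(0, 0, 4, 2))
```

A single function like this, called with different end points, may also be a way to replace the duplicated per-direction blocks.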

3) I THINK that I see some code -- those duplicated parts -- which might 
benefit from the use of multiprocessing (assuming that you have a multi-core 
CPU).  But I would have to read more deeply to be sure.  I need to understand 
the algorithm more completely, and exactly how you have modified it for your 
needs.


Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Piotr Dobrogost
On Friday, December 6, 2013 3:07:51 PM UTC+1, Neil Cerutti wrote:
> On 2013-12-04, Piotr Dobrogost wrote:
> > On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti
> > wrote:
> >> not something to do commonly. Your proposed syntax leaves the
> >> distinction between valid and invalid identifiers a problem
> >> the programmer has to deal with. It doesn't unify access to
> >> attributes the way the getattr and setattr do.
> >
> > Taking into account that obj.'x' would be equivalent to obj.x
> > any attribute can be accessed with the new syntax. I don't see
> > how this is not unified access compared to using getattr
> > instead dot...
>
> I thought of that argument later the next day. Your proposal does
> unify access if the old obj.x syntax is removed.

As long as obj.x is a very concise way to get the attribute named 'x' from object
obj, it's somewhat odd that the identifier x is treated not like an identifier but
like the string literal 'x'. If it were treated like an identifier, then we would
get the attribute whose name is the value of x instead of the attribute named 'x'.
Making it possible to use string literals in the form obj.'x', as proposed, would
make getattr basically needless as long as we use only a variable, not an
expression, to denote the attribute's name.
This is just a casual remark.
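
For comparison, this is what the existing spelling looks like today. It is a small sketch; the Record class and the attribute name are made up for illustration.

```python
class Record:
    """Plain object used only to hold attributes."""

record = Record()

# Attribute names that are not valid identifiers need setattr/getattr.
setattr(record, "content-type", "text/plain")

name = "content-type"            # the name can also come from a variable
value = getattr(record, name)
missing = getattr(record, "no such attribute", None)  # default if absent

print(value)
print(missing)
print(sorted(vars(record)))
```

With the proposed syntax the first lookup would presumably be written record.'content-type'.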


Regards,
Piotr


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread rusi
On Friday, December 6, 2013 9:55:54 PM UTC+5:30, Mark Lawrence wrote:
> On 06/12/2013 16:19, rusi wrote:

> > So someone please update that page!

> This is a community so why don't you?

Ok done (at least a first draft)
I was under the impression that not just anyone could edit it.


Re: squeeze out some performance

2013-12-06 Thread Robert Voigtländer
On Friday, 6 December 2013 17:36:03 UTC+1, Mark Lawrence wrote:

> > I already did some basic profiling and optimized a lot. Especially with
> > help of a goof python performance tips list I found.
>
> Wonderful typo -^ :)

Oh well :-) ... it was a good one. Just had a quick look at Cython. Looks 
great. Thanks for the tip.


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread MRAB

On 06/12/2013 15:34, Steven D'Aprano wrote:
> On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:
>> yes ,I am a native Chinese speaker.I always post question by Google
>> Group not through  email ,is there something wrong with it ? your
>> english is a little strange to me .
>
> Mark is writing in fake old-English style, the way people think English
> was spoken a thousand years ago. I don't know why he did that. Perhaps he
> thought it was amusing.
>
[snip]
You're exaggerating. It's more like 500 years ago. :-)



Re: squeeze out some performance

2013-12-06 Thread Mark Lawrence

On 06/12/2013 16:29, Robert Voigtländer wrote:
> Thanks for your replies.
>
> I already did some basic profiling and optimized a lot. Especially with help
> of a goof python performance tips list I found.

Wonderful typo -^ :)

> I think I'll follow the cython path.
> The geometry approach also sound good. But it's way above my math/geometry
> knowledge.
>
> Thanks for your input!




--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: squeeze out some performance

2013-12-06 Thread Robert Voigtländer
Thanks for your replies.

I already did some basic profiling and optimized a lot. Especially with help of 
a goof python performance tips list I found.

I think I'll follow the cython path.
The geometry approach also sound good. But it's way above my math/geometry 
knowledge.

Thanks for your input!


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Mark Lawrence

On 06/12/2013 16:19, rusi wrote:
> On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote:
>> On 06/12/2013 15:34, Steven D'Aprano wrote:
>>> (if I remember correctly) I think Mark also
>>> sometimes posts a link to managing Google Groups.
>>
>> You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython
>
> That link needs updating.
>
> Even if my almost-automatic correction methods are not considered
> kosher for some reason or other, the thing that needs to go in there
> is that GG has TWO problems
>
> 1. Blank lines
> 2. Long lines
>
> That link only describes 1.
>
> Roy's yesterday's post in "Packaging a proprietary python library"
> says:
>
>> I, and Rusi, know enough, and take the effort, to overcome its
>> shortcomings doesn't change that.
>
> But in fact his post takes care of 1 not 2.
>
> In all fairness I did not know that 2 is a problem until rurpy pointed
> it out recently and was not correcting it. In fact, I'd take the
> trouble to make the lines long assuming that clients were intelligent
> enough to fit it properly into whatever was the current window!!!
>
> So someone please update that page!

This is a community so why don't you?

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread rusi
On Friday, December 6, 2013 9:23:47 PM UTC+5:30, Mark Lawrence wrote:
> On 06/12/2013 15:34, Steven D'Aprano wrote:
> > (if I remember correctly) I think Mark also
> > sometimes posts a link to managing Google Groups.
>
> You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython

That link needs updating.

Even if my almost-automatic correction methods are not considered
kosher for some reason or other, the thing that needs to go in there
is that GG has TWO problems

1. Blank lines
2. Long lines

That link only describes 1.

Roy's yesterday's post in "Packaging a proprietary python library"
says:

> I, and Rusi, know enough, and take the effort, to overcome its
> shortcomings doesn't change that.

But in fact his post takes care of 1 not 2.

In all fairness I did not know that 2 is a problem until rurpy pointed
it out recently and was not correcting it. In fact, I'd take the
trouble to make the lines long assuming that clients were intelligent
enough to fit it properly into whatever was the current window!!!

So someone please update that page!


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Mark Lawrence

On 06/12/2013 15:34, Steven D'Aprano wrote:
> (if I remember correctly) I think Mark also
> sometimes posts a link to managing Google Groups.

You do, and here it is https://wiki.python.org/moin/GoogleGroupsPython

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: Packaging a proprietary Python library for multiple OSs

2013-12-06 Thread Kevin Walzer

On 12/5/13, 10:50 AM, Michael Herrmann wrote:
> On Thursday, December 5, 2013 4:26:40 PM UTC+1, Kevin Walzer wrote:
>> On 12/5/13, 5:14 AM, Michael Herrmann wrote:
>> If your library and their dependencies are simply .pyc files, then I
>> don't see why a zip collated via py2exe wouldn't work on other
>> platforms. Obviously this point is moot if your library includes true
>> compiled (C-based) extensions.
>
> As I said, I need to make my *build* platform-independent.

Giving this further thought, I'm wondering how hard it would be to roll 
your own using modulefinder, Python's zip tools, and some custom code. 
Just sayin'.


--Kevin


--
Kevin Walzer
Code by Kevin/Mobile Code by Kevin
http://www.codebykevin.com
http://www.wtmobilesoftware.com


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Steven D'Aprano
On Fri, 06 Dec 2013 06:52:48 -0800, iMath wrote:

> Yes, I am a native Chinese speaker. I always post questions through Google
> Groups, not through email. Is there something wrong with that? Your
> English is a little strange to me.

Mark is writing in fake old-English style, the way people think English 
was spoken a thousand years ago. I don't know why he did that. Perhaps he 
thought it was amusing.

There are many problems with Google Groups. If you pay attention to this 
forum, you will see dozens of posts about "Managing Google Groups 
headaches" and other complaints:

- Google Groups double-spaces replies, so text which should appear like:

line one
line two
line three
line four

  turns into:

line one
blank line
line two
blank line
line three
blank line
line four

- Google Groups often starts sending HTML code instead of plain text

- it often mangles indentation, which is terrible for Python code

- sometimes it automatically sets the reply address for posts to go
  to Google Groups, instead of the mailing list it should go to

- almost all of the spam on this forum comes from Google Groups, so many 
  people automatically filter everything from Google Groups straight to
  the trash.

There are alternatives to Google Groups:

- the mailing list, python-list@python.org

- Usenet, comp.lang.python

- the Gmane mirror:

  http://gmane.org/find.php?list=python-list%40python.org


and possibly others. You will maximise the number of people reading your 
posts if you avoid Google Groups. If for some reason you cannot use any 
of the alternatives, please take the time to fix some of the problems 
with Google Groups. If you search the archives, you should find some 
posts by Rusi defending Google Groups and explaining what he does to make 
it more presentable, and (if I remember correctly) I think Mark also 
sometimes posts a link to managing Google Groups.



-- 
Steven


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread rusi
On Friday, December 6, 2013 8:42:02 PM UTC+5:30, Mark Lawrence wrote:
> The English I used was archaic, please ignore it :)

"Archaic" is almost archaic
"Old" is ever-young

:D


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread rusi
On Friday, December 6, 2013 8:22:48 PM UTC+5:30, iMath wrote:
> On Friday, December 6, 2013 5:23:59 PM UTC+8, Mark Lawrence wrote:
> > On 06/12/2013 06:23, iMath wrote:
> > Dearest iMath, wouldst thou be kind enough to partake of obtaining some 
> > type of email client that dost not sendeth double spaced data into this 
> > most illustrious of mailing lists/newsgroups.  Thanking thee for thine 
> > participation in my most humble of requests.  I do remain your most 
> > obedient servant.

> Yes, I am a native Chinese speaker. I always post questions through Google
> Groups, not through email. Is there something wrong with that?

Yes, but it's easily correctable.

I recently answered this question for another poster here:

https://groups.google.com/forum/#!searchin/comp.lang.python/rusi$20google$20groups|sort:date/comp.lang.python/C51hEvi-KbY/KSeaMFoHtcIJ


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Mark Lawrence

On 06/12/2013 14:52, iMath wrote:

On Friday, December 6, 2013 5:23:59 PM UTC+8, Mark Lawrence wrote:

On 06/12/2013 06:23, iMath wrote:



Dearest iMath, wouldst thou be kind enough to partake of obtaining some

type of email client that dost not sendeth double spaced data into this

most illustrious of mailing lists/newsgroups.  Thanking thee for thine

participation in my most humble of requests.  I do remain your most

obedient servant.



--

My fellow Pythonistas, ask not what our language can do for you, ask

what you can do for our language.



Mark Lawrence


Yes, I am a native Chinese speaker. I always post questions through Google
Groups, not through email. Is there something wrong with that?
Your English is a little strange to me.



You can see the extra lines inserted by google groups above.  It's not 
too bad in one and only one message, but when a message has been 
backwards and forwards several times it's extremely irritating, or worse 
still effectively unreadable.  Work arounds have been posted on this 
list, but I'd recommend using any decent email client.


The English I used was archaic, please ignore it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 1:54 AM, iMath  wrote:
> fp=tempfile.NamedTemporaryFile(delete=False)
> fp.write(("file '"+fileName1+"'\n").encode('utf-8'))
> fp.write(("file '"+fileName2+"'\n").encode('utf-8'))
>
>
> subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c',  'copy', 
> fileName])
> fp.close()

You need to close the file before getting the other process to use it.
Otherwise, it may not be able to open the file at all, and even if it
can, you might find that not all the data has been written.

But congrats! You have successfully found the points I was directing
you to. Yes, I was hinting that you need NamedTemporaryFile, the .name
attribute, and delete=False. Good job!
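
A minimal sketch of the order of operations described above, assuming Python 3 and an ffmpeg binary on the PATH. The helper names write_concat_list and concat_videos are made up for illustration, and file names containing a single quote would need extra escaping that is not handled here.

```python
import os
import subprocess
import tempfile


def write_concat_list(file_names):
    """Write an ffmpeg concat list to a temporary file and return its path."""
    # delete=False keeps the file on disk after close(), so that another
    # process can open it by name.
    fp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt",
                                     encoding="utf-8", delete=False)
    try:
        for name in file_names:
            fp.write("file '{}'\n".format(name))
    finally:
        # Closing flushes the data; do this BEFORE the subprocess runs.
        fp.close()
    return fp.name


def concat_videos(file_names, output_name):
    """Run ffmpeg on the list file, then remove the list file by hand."""
    list_path = write_concat_list(file_names)
    try:
        return subprocess.call(['ffmpeg', '-f', 'concat', '-i', list_path,
                                '-c', 'copy', output_name])
    finally:
        # With delete=False nothing removes the file automatically.
        os.remove(list_path)
```

The file is written and closed in one function and consumed in another, so the subprocess never sees a half-written or still-open file.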

ChrisA


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread iMath
On Wednesday, December 4, 2013 6:51:49 PM UTC+8, Chris Angelico wrote:
> On Wed, Dec 4, 2013 at 8:38 PM, Andreas Perstinger wrote:
> > "fp" is a file object, but subprocess expects a list of strings as
> > its first argument.
>
> More fundamentally: The subprocess's arguments must include the *name*
> of the file. This means you can't use TemporaryFile at all, as it's
> not guaranteed to return an object that actually has a file name.
>
> There's another problem, too, and that's that you're not closing the
> file before expecting the subprocess to open it. And once you do that,
> you'll find that the file no longer exists once it's been closed. In
> fact, you'll need to research the tempfile module a bit to be able to
> do what you want here; rather than spoon-feed you an exact solution,
> I'll just say that there is one, and it can be found here:
>
> http://docs.python.org/3.3/library/tempfile.html
>
> ChrisA

I think you mean I should create the temporary file with NamedTemporaryFile().
After trying it many times, I found that a temporary file is hardly more
convenient than a persistent one here: the subprocess can't use the temporary
file until it has been closed, so we can't rely on the file deleting itself
automatically on close. We have to delete it later with os.remove(), after the
command line has used it.

Here is the code without the with statement, but it is wrong; it prints this line:

c:\docume~1\admini~1\locals~1\temp\tmp0d8959: Invalid data found when 
processing input


fp=tempfile.NamedTemporaryFile(delete=False)
fp.write(("file '"+fileName1+"'\n").encode('utf-8')) 
fp.write(("file '"+fileName2+"'\n").encode('utf-8')) 


subprocess.call(['ffmpeg', '-f', 'concat','-i',fp.name, '-c',  'copy', 
fileName])
fp.close()


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread iMath
On Friday, December 6, 2013 5:23:59 PM UTC+8, Mark Lawrence wrote:
> On 06/12/2013 06:23, iMath wrote:
> 
> 
> 
> Dearest iMath, wouldst thou be kind enough to partake of obtaining some 
> 
> type of email client that dost not sendeth double spaced data into this 
> 
> most illustrious of mailing lists/newsgroups.  Thanking thee for thine 
> 
> participation in my most humble of requests.  I do remain your most 
> 
> obedient servant.
> 
> 
> 
> -- 
> 
> My fellow Pythonistas, ask not what our language can do for you, ask 
> 
> what you can do for our language.
> 
> 
> 
> Mark Lawrence

Yes, I am a native Chinese speaker. I always post questions through Google
Groups, not through email. Is there something wrong with that?
Your English is a little strange to me.


Re: Managing Google Groups headaches

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 1:11 AM, rusi  wrote:
> Aha! There you are! It's 'page editor' here and not the html which
> 'display source' (control-u) in a browser would show. And MediaWiki
> is the software that mediates.
>
> The usual direction (seen by users of wikipedia) is that MediaWiki
> takes this text, along with the other unrelated (metadata?) seen
> around -- sidebar, tabs etc, css settings and munges it all into html
>
> The other direction (seen by editors of wikipedia) is that you edit a
> page and that page and history etc will show the changes,
> reflecting the fact that the SQL content has changed.

MediaWiki is fundamentally very similar to a structure that I'm trying
to deploy for a community web site that I host, approximately thus:

* A git repository stores a bunch of RST files
* A script auto-generates index files based on the presence of certain
file names, and renders via rst2html
* The HTML pages are served as static content

MediaWiki is like this:

* Each page has a history, represented by a series of state snapshots
of wikitext
* On display, the wikitext is converted to HTML and served.

The main difference is that MediaWiki is optimized for rapid and
constant editing, where what I'm pushing for is optimized for less
common edits that might span multiple files. (MW has no facility for
atomically changing multiple pages, and atomically reverting those
changes, and so on. Each page stands alone.) They're still broadly
doing the same thing: storing marked-up text and rendering HTML. The
fact that one uses an SQL database and the other uses a git repository
is actually quite insignificant - it's as significant as the choice of
whether to store your data on a hard disk or an SSD. The system is no
different.

>> MediaWiki uses an SQL database to store that lump of text, but
>> ultimately the relationship is between wikitext and HTML, no SQL
>> involvement.
>
> Dunno what you mean. Every time someone browses wikipedia, things are
> getting pulled out of the SQL and munged into the html (s)he sees.

Yes, but that's just mechanics. The fact that the PHP scripts to
operate Wikipedia are being pulled off a file system doesn't mean that
MediaWiki is an ext3-to-HTML renderer. It's a wikitext-to-HTML
renderer.

Anyway. As I said, your point is still mostly there, as long as you
use wikitext rather than SQL.

ChrisA


Re: Managing Google Groups headaches

2013-12-06 Thread rusi
On Friday, December 6, 2013 7:18:19 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:32 AM, rusi  wrote:
> > I guess we are using 'structured' in different ways.  All I am saying
> > is that mediawiki which seems to present as html, actually stores its
> > stuff as SQL -- nothing more or less structured than the schemas here:
> > http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

> Yeah, but the structure is all about the metadata.

Ok (I'd drop the 'all')

> Ultimately, there's one single text field containing the entire content

Right

> as you would see it in the page editor: wiki markup in straight text.

Aha! There you are! It's 'page editor' here and not the html which
'display source' (control-u) in a browser would show. And MediaWiki
is the software that mediates.

The usual direction (seen by users of wikipedia) is that MediaWiki
takes this text, along with the other unrelated (metadata?) seen
around -- sidebar, tabs etc, css settings and munges it all into html

The other direction (seen by editors of wikipedia) is that you edit a
page and that page and history etc will show the changes,
reflecting the fact that the SQL content has changed.

> MediaWiki uses an SQL database to store that lump of text, but
> ultimately the relationship is between wikitext and HTML, no SQL
> involvement.


Dunno what you mean. Every time someone browses wikipedia, things are
getting pulled out of the SQL and munged into the html (s)he sees.


Re: Why is there no natural syntax for accessing attributes with names not being valid identifiers?

2013-12-06 Thread Neil Cerutti
On 2013-12-04, Piotr Dobrogost
 wrote:
> On Wednesday, December 4, 2013 10:41:49 PM UTC+1, Neil Cerutti
> wrote:
>> not something to do commonly. Your proposed syntax leaves the
>> distinction between valid and invalid identifiers a problem
>> the programmer has to deal with. It doesn't unify access to
>> attributes the way the getattr and setattr do.
>
> Taking into account that obj.'x' would be equivalent to obj.x
> any attribute can be accessed with the new syntax. I don't see
> how this is not unified access compared to using getattr
> instead dot...

I thought of that argument later the next day. Your proposal does
unify access if the old obj.x syntax is removed.
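
For reference, a small sketch of what getattr and setattr already allow today. The Record class and the attribute names are made up for illustration; only the built-in functions are real.

```python
class Record:
    """Empty class used only to hold attributes."""


obj = Record()

# Names that are valid identifiers work with both forms.
obj.x = 1
assert getattr(obj, "x") == obj.x

# Names that are not valid identifiers only work through setattr/getattr.
setattr(obj, "content-type", "text/plain")
value = getattr(obj, "content-type")

# getattr also takes a default for names that are missing.
default = getattr(obj, "no such name", None)
```

The same two calls cover both kinds of name, which is the unification referred to in the quoted text.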

-- 
Neil Cerutti



Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Tim,

On 06/12/13 20:47, Tim Golden wrote:
> On 06/12/2013 09:27, Chris Angelico wrote:
>> On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
>>> PS. Apologies if any of these messages come through more than once. Most
>>> lists that I've posted to set reply-to meaning a normal reply can be used,
>>> but python-list does not seem to. The replies I have sent manually to
>>> python-list@python.org instead don't seem to have appeared. I'm not quite
>>> sure what is happening- apologies for any blundering around on my part
>>> trying to figure it out.
>>
>> They are coming through more than once. If you're subscribed to the
>> list, sending to python-list@python.org should be all you need to do -
>> where else are they going?
>
> I released a batch from the moderation queue from Garthy first thing
> this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
> first as to whether they'd already got through to the list some other way.


I had to make a call between re-sending posts that might have gone 
missing, or seemingly not responding promptly when people had taken the 
time to answer my complex query. I made a call to re-send, and it was 
the wrong one. The fault for the double-posting is entirely mine.


Cheers,
Garth


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean-Michel Pichavant
> I tried out your suggestions and discovered that I had to add the line
> import sys to the program. So you can see below what I came up with.
> It works but it's not all clear to me. Can you tell me what
> "label.bind("<1>", quit)" stands for? What does the <1> mean?
>
> #!/usr/bin/env python
> import Tkinter as tk
> import sys
> #underscore is necessary in the following line
> def quit(_):
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Click mouse here to quit")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
> 
> thanks
> jean

The best thing to do would be to read
http://effbot.org/tkinterbook/tkinter-events-and-bindings.htm


"<1>" is the identifier for your mouse button 1.
quit is the callback that the label calls when it receives a mouse button 1 click event.

Note that the parameter given to your quit callback is the event.
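
A small sketch of a callback that actually uses the event, assuming Python 3 (where the module is named tkinter rather than Tkinter). The names quit_on_click and main are made up for illustration.

```python
import sys


def quit_on_click(event):
    """Handler for a mouse button 1 click; Tkinter passes in the event object."""
    # event.x and event.y are the click position relative to the widget.
    print("clicked at", event.x, event.y)
    sys.exit()


def main():
    # Imported here so quit_on_click can be tried without a display.
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="Click mouse here to quit")
    label.pack()
    # "<Button-1>" is the long form of "<1>".
    label.bind("<Button-1>", quit_on_click)
    root.mainloop()
```

Calling main() opens the window; clicking the label prints the click position and exits.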

JM






Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris,

On 06/12/13 22:27, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 8:35 PM, Garthy wrote:
>> I think the ideal is completely sandboxed, but it's something that I
>> understand I may need to make compromises on. The bare minimum would be
>> protection against inadvertent interaction. Better yet would be a setup that
>> made such interaction annoyingly difficult, and the ideal would be where it
>> was impossible to interfere.
>
> In Python, "impossible to interfere" is a pipe dream. There's no way
> to stop Python from fiddling around with the file system, and if
> ctypes is available, with memory in the running program. The only way
> to engineer that kind of protection is to prevent _the whole process_
> from doing those things (using OS features, not Python features),
> hence the need to split the code out into another process (which might
> be chrooted, might be running as a user with no privileges, etc).

Absolutely- it would be an impractical ideal. If it was my highest and 
only priority, CPython might not be the best place to start. But there 
are plenty of other factors that make Python very desirable to use 
regardless. :) Re file and ctype-style functionality, that is something 
I'm going to have to find a way to limit somewhat. But first things 
first: I need to see what I can accomplish re initial embedding with a 
reasonable amount of work.


> A setup that makes such interaction "annoyingly difficult" is possible
> as long as your users don't think Ruby. For instance:
>
> # script1.py
> import sys
> sys.stdout = open("logfile", "w")
> while True: print("Blah blah")
>
> # script2.py
> import sys
> sys.stdout = open("otherlogfile", "w")
> while True: print("Bleh bleh")
>
>
> These two scripts won't play nicely together, because each has
> modified global state in a different module. So you'd have to set that
> as a rule. (For this specific example, you probably want to capture
> stdout/stderr to some sort of global log file anyway, and/or use the
> logging module, but it makes a simple example.)
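
The logging-module alternative mentioned in the quoted text could look roughly like this. It is only a sketch: make_script_logger is a made-up helper, and the in-memory streams stand in for whatever log files the host application would really use.

```python
import io
import logging


def make_script_logger(name, stream):
    """Return a logger that writes only to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keep records out of the root logger so scripts cannot mix output.
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


# Each script gets its own logger instead of reassigning sys.stdout.
log1_stream = io.StringIO()
log2_stream = io.StringIO()
log1 = make_script_logger("script1", log1_stream)
log2 = make_script_logger("script2", log2_stream)

log1.info("Blah blah")
log2.info("Bleh bleh")
```

Neither script touches global state in the sys module here, so they no longer interfere with each other's output.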

Thanks for the example. Hopefully I can minimise the cases where this 
would potentially be a problem. Modifying the basic environment and the 
source is something I can do readily if needed.


Re stdout/stderr, on that subject I actually wrote a replacement log 
catcher for embedded Python a few years back. I can't remember how on 
earth I did it now, but I've still got the code that did it somewhere.


> Most Python scripts
> aren't going to do this sort of thing, or if they do, will do very
> little of it. Monkey-patching other people's code is a VERY rare thing
> in Python.

That's good to hear. :)

>> The closest analogy for understanding would be browser plugins: Scripts from
>> multiple authors who for the most part aren't looking to create deliberate
>> incompatibilities or interference between plugins. The isolation is basic,
>> and some effort is made to make sure that one plugin can't cripple another
>> trivially, but the protection is not exhaustive.
>
> Browser plugins probably need a lot more protection - maybe it's not
> exhaustive, but any time someone finds a way for one plugin to affect
> another, the plugin / browser authors are going to treat it as a bug.
> If I understand you, though, this is more akin to having two forms on
> one page and having JS validation code for each. It's trivially easy
> for one to check the other's form objects, but quite simple to avoid
> too, so for the sake of encapsulation you simply stay safe.

There have been cases where browser plugins have played funny games to 
mess with the behaviour of other plugins (eg. one plugin removing 
entries from the configuration of another). It's certainly not ideal, 
but it comes from the environment being not entirely locked down, and 
one plugin author being inclined enough to make destructive changes that 
impact another. I think the right effort/reward ratio will mean I end up 
in a similar place.


I know it's not the best analogy, but it was one that readily came to 
mind. :)


>> With the single interpreter and multiple thread approach suggested, do you
>> know if this will work with threads created externally to Python, ie. if I
>> can create a thread in my application as normal, and then call something
>> like PyGILState_Ensure() to make sure that Python has the internals it needs
>> to work with it, and then use the GIL (or similar) to ensure that accesses
>> to it remain thread-safe?
>
> Now that's something I can't help with. The only time I embedded
> Python seriously was a one-Python-per-process system (arbitrary number
> of processes fork()ed from one master, but each process had exactly
> one Python environment and exactly one database connection, etc), and
> I ended up being unable to make it secure, so I had to switch to
> embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
> curious what my boss plans to do, now that he's fired me; he hinted at
> rewri

Re: Sharing Python installation between architectures

2013-12-06 Thread Albert van der Horst
In article ,
Paul Smith   wrote:
>One thing I always liked about Perl was the way you can create a single
>installation directory which can be shared between architectures.  Say
>what you will about the language: the Porters have an enormous amount of
>experience and expertise producing portable and flexible interpreter
>installations.
>
>By this I mean, basically, multiple architectures (Linux, Solaris,
>MacOSX, even Windows) sharing the same $prefix/lib/python2.7 directory.
>The large majority of the contents there are completely portable across
>architectures (aren't they?) so why should I have to duplicate many
>megabytes worth of files?

The solution is of course to replace all duplicates by hard links.
A tool for this is useful in a lot of other circumstances too.
In a re-installation of the whole or parts, the hard links
will be removed, and the actual files are only removed if they aren't needed
for any of the installations, so this is transparent for reinstallation.
After a lot of reinstallation you want to run the tool again.

This is of course only possible on real file systems (probably not on FAT),
but your files reside on a server, so chances are they are on a real file
system.

(The above is partly in jest. It is a real solution to storage problems,
but storage problems are unheard of in these days of terabyte disks.
It doesn't help with the clutter, which was probably the main motivation.)
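
The hard-link approach described above could be sketched as follows. This is an illustration only, not an existing tool: hardlink_duplicates and file_digest are made-up names, the sketch assumes all files live on one file system that supports hard links, and it ignores differences in ownership, permissions and timestamps.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hard links.

    Returns the number of files that were replaced.
    """
    first_seen = {}
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_digest(path))
            original = first_seen.setdefault(key, path)
            # Skip the first copy and anything already linked to it.
            if original == path or os.path.samefile(original, path):
                continue
            # Link under a temporary name, then swap it into place.
            tmp = path + ".hardlink-tmp"
            os.link(original, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

Running it a second time on the same tree replaces nothing, which matches the idea of re-running the tool after reinstallations.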

Symbolic links are not as transparent, but they may work very well too.
Have the common part set apart and replace everything else by symbolic links.

There is always one more way to skin a cat.

Groetjes Albert
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst



Re: Managing Google Groups headaches

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 12:32 AM, rusi  wrote:
> I guess we are using 'structured' in different ways.  All I am saying
> is that mediawiki which seems to present as html, actually stores its
> stuff as SQL -- nothing more or less structured than the schemas here:
> http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage

Yeah, but the structure is all about the metadata. Ultimately, there's
one single text field containing the entire content as you would see
it in the page editor: wiki markup in straight text. MediaWiki uses an
SQL database to store that lump of text, but ultimately the
relationship is between wikitext and HTML, no SQL involvement.

Wiki markup is reasonable for text structuring. (Not for generic data
structuring, but it's decent for text.) Same with reStructuredText,
used for PEPs. An SQL database is a good way to store mappings of
"this key, this tuple of data" and retrieve them conveniently,
including (and this is the bit that's more complicated in a straight
Python dictionary) using any value out of the tuple as the key, and
(and this is where a dict *really* can't hack it) storing/retrieving
more data than fits in memory. The two are orthogonal. Your point is
better supported by wikitext than by SQL, here, except that there
aren't fifty other systems that parse and display wikitext. In fact,
what you're suggesting is a good argument for deprecating HTML email
in favour of RST email, and using docutils to render the result either
as HTML (for webmail users) or as some other format. And I wouldn't be
against that :) But good luck convincing the world that Microsoft
Outlook is doing the wrong thing.

ChrisA


Re: Managing Google Groups headaches

2013-12-06 Thread rusi
On Friday, December 6, 2013 6:49:04 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Dec 7, 2013 at 12:03 AM, rusi wrote:
> > SQL databases (assuming thats the mediawiki backend) is another -- ok for
> > data-structuring bad for presentation.

> No, SQL databases don't store structured text. MediaWiki just stores a
> single blob (not in the database sense of that word) of text.

I guess we are using 'structured' in different ways.  All I am saying
is that mediawiki which seems to present as html, actually stores its
stuff as SQL -- nothing more or less structured than the schemas here:
http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture#Database_and_text_storage


Re: Managing Google Groups headaches

2013-12-06 Thread Chris Angelico
On Sat, Dec 7, 2013 at 12:03 AM, rusi  wrote:
> SQL databases (assuming thats the mediawiki backend) is another -- ok for
> data-structuring bad for presentation.

No, SQL databases don't store structured text. MediaWiki just stores a
single blob (not in the database sense of that word) of text.

ChrisA


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean Dubois
On Friday, December 6, 2013 13:30:53 UTC+1, Daniel Watkins wrote:
> Hi Jean,
>
> On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote:
> > I'm trying out Tkinter with the (non object oriented) code fragment below:
> > It works partially as I expected, but I thought that pressing "1" would
> > cause the program to quit, however I get this message:
> > TypeError: quit() takes no arguments (1 given), I tried changing quit to
> > quit()
> > but that makes things even worse. So my question: can anyone here help me
> > debug this?
>
> I don't know the details of the Tkinter library, but you could find out
> what quit is being passed by modifying it to take a single parameter and
> printing it out (or using pdb):
>
>     def quit(param):
>         print(param)
>         sys.exit()
>
> Having taken a quick look at the documentation, it looks like event
> handlers (like your quit function) are passed the event that triggered
> them.  So you can probably just ignore the parameter:
>
>     def quit(_):
>         sys.exit()
>
> Cheers,
>
> Dan

I tried out your suggestions and discovered that I had to add the line
import sys to the program. So you can see below what I came up with.
It works but it's not all clear to me. Can you tell me what "label.bind("<1>", 
quit)" stands for? What does the <1> mean?



#!/usr/bin/env python
import Tkinter as tk
import sys
#underscore is necessary in the following line
def quit(_):
    sys.exit()
root = tk.Tk()
label = tk.Label(root, text="Click mouse here to quit")
label.pack()
label.bind("<1>", quit)
root.mainloop()

thanks
jean




Re: Managing Google Groups headaches

2013-12-06 Thread rusi
On Friday, December 6, 2013 1:06:30 PM UTC+5:30, Roy Smith wrote:
>  Rusi  wrote:

> > On Thursday, December 5, 2013 6:28:54 AM UTC+5:30, Roy Smith wrote:

> > > The real problem with web forums is they conflate transport and 
> > > presentation into a single opaque blob, and are pretty much universally 
> > > designed to be a closed system.  Mail and usenet were both engineered to 
> > > make a sharp division between transport and presentation, which meant it 
> > > was possible to evolve each at their own pace.
> > > Mostly that meant people could go off and develop new client 
> > > applications which interoperated with the existing system.  But, it also 
> > > meant that transport layers could be switched out (as when NNTP 
> > > gradually, but inexorably, replaced UUCP as the primary usenet transport 
> > > layer).
> > There is a deep assumption hovering round-about the above -- what I
> > will call the 'Unix assumption(s)'.

> It has nothing to do with Unix.  The separation of transport from 
> presentation is just as valid on Windows, Mac, etc.

> > But before that, just a check on
> > terminology. By 'presentation' you mean what people normally call
> > 'mail-clients': thunderbird, mutt etc. And by 'transport' you mean
> > sendmail, exim, qmail etc etc -- what normally are called
> > 'mail-servers.'  Right??

> Yes.

> > Assuming this is the intended meaning of the terminology (yeah its
> > clearer terminology than the usual and yeah Im also a 'Unix-guy'),
> > here's the 'Unix-assumption':
> >   - human communication�
> > (is not very different from)
> >   - machine communication�
> > (can be done by)
> >   - text�
> > (for which)
> >   - ASCII is fine�
> > (which is just)
> >   - bytes�
> > (inside/between byte-memory-organized)
> >   - von Neumann computers
> > To the extent that these assumptions are invalid, the 'opaque-blob'
> > may well be preferable.

> I think you're off on the wrong track here.  This has nothing to do with 
> plain text (ascii or otherwise).  It has to do with divorcing how you 
> store and transport messages (be they plain text, HTML, or whatever) 
> from how a user interacts with them.


Evidently (and completely inadvertently) this exchange has just
illustrated one of the inadmissible assumptions:

"unicode as a medium is universal in the same way that ASCII used to be"

I wrote a number of ellipsis characters ie codepoint 2026 as in:

  - human communication…
(is not very different from)
  - machine communication… 

Somewhere between my sending and your quoting those ellipses became
the replacement character FFFD

> >   - human communication�
> > (is not very different from)
> >   - machine communication�

Leaving aside whose fault this is (very likely buggy google groups),
this mojibaking cannot happen if the assumption "All text is ASCII"
were to uniformly hold.

Of course with unicode also this can be made to not happen, but that
is fragile and error-prone.  And that is because ASCII (not extended)
is ONE thing in a way that unicode is hopelessly a motley inconsistent
variety.

With unicode there are in-memory formats, transportation formats eg
UTF-8, strange beasties like FSR (which then hopelessly and
inveterately tickle our resident trolls!), multi-layer encodings (in
html), BOMs and unnecessary/inconsistent BOMs (in microsoft-notepad).
With ASCII, ASCII is ASCII; ie "ABC" is 65,66,67 whether it's in-core,
in-file, in-pipe or whatever.  Ok there are a few wrinkles to this,
eg the null-terminator in C-strings. I think this is the exception to
the rule that in classic Unix, ASCII is completely inter-operable and
therefore a universal data-structure for inter-process or inter-machine
communication.
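
The byte-level claim and the ellipsis mishap above can be reproduced in a
few lines. This is only a minimal sketch, assuming Python 3; the choice of
cp1252 as the sending encoding is a guess at what might have happened in
transit, not something this thread establishes.

```python
# "ABC" is the same three bytes in ASCII and in UTF-8, but not in UTF-16.
abc_ascii = "ABC".encode("ascii")
abc_utf8 = "ABC".encode("utf-8")
abc_utf16 = "ABC".encode("utf-16-le")

# The ellipsis, code point U+2026, has no ASCII form at all.
ellipsis = "\u2026"
ellipsis_utf8 = ellipsis.encode("utf-8")      # three bytes: E2 80 A6
ellipsis_cp1252 = ellipsis.encode("cp1252")   # one byte: 85

# Assumption: one side wrote cp1252 and the other side decoded as UTF-8.
# The lone byte 0x85 is not valid UTF-8, so it turns into U+FFFD.
garbled = ellipsis_cp1252.decode("utf-8", errors="replace")

print(list(abc_ascii), list(abc_utf8), list(abc_utf16))
print(repr(garbled))
```

The same mismatch cannot arise for "ABC", since every encoding step in the
sketch agrees on the bytes 65,66,67 except the UTF-16 one.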

It is this universal data structure that makes classic unix pipes and
filters possible and easy (of which your separation of presentation
and transportation is just one case).

Give it up and the composability goes with it.

Go up from the ASCII -> Unicode level to the plain-text -> hypertext
(aka html) level and these composability problems hit with redoubled
force.

> Take something like Wikipedia (by which, I really mean, MediaWiki, which 
> is the underlying software package).  Most people think of Wikipedia as 
> a web site.  But, there's another layer below that which lets you get 
> access to the contents of articles, navigate all the rich connections 
> like category trees, and all sorts of metadata like edit histories.  
> Which means, if I wanted to (and many examples of this exist), I can 
> write my own client which presents the same information in different 
> ways.

Not sure what your point is.
HTML is one universal format -- ok for presentation, bad for
data-structuring.
SQL databases (assuming that's the MediaWiki backend) are another -- ok for
data-structuring, bad for presentation.

Mediawiki mediates between the two formats.

Beyond that I lost you... what are you trying to say??
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Daniel Watkins
Hi Jean,

On Fri, Dec 06, 2013 at 04:24:59AM -0800, Jean Dubois wrote:
> I'm trying out Tkinter with the (non object oriented) code fragment below:
> It works partially as I expected, but I thought that pressing "1" would
> cause the program to quit, however I get this message:
> TypeError: quit() takes no arguments (1 given), I tried changing quit to 
> quit()
> but that makes things even worse. So my question: can anyone here help me
> debug this?

I don't know the details of the Tkinter library, but you could find out
what quit is being passed by modifying it to take a single parameter and
printing it out (or using pdb):

def quit(param):
    print(param)
    sys.exit()

Having taken a quick look at the documentation, it looks like event
handlers (like your quit function) are passed the event that triggered
them.  So you can probably just ignore the parameter:

def quit(_):
    sys.exit()
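
Putting this together, a corrected version of the posted script might look
like the sketch below. It assumes Python 3 (module name tkinter; the
original post uses Python 2's Tkinter), renames the handler to the
hypothetical quit_app so that it does not shadow the built-in quit, and
adds the "import sys" that the posted fragment needs before sys.exit()
can work.

```python
import sys


def quit_app(event=None):
    """Handler for label.bind(); Tkinter passes the triggering event."""
    sys.exit()


def main():
    # Imported here so the handler above can be exercised without a display.
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="Hello, world")
    label.pack()
    label.bind("<1>", quit_app)  # "<1>" is a click with mouse button 1
    root.mainloop()
```

Calling main() opens the window; clicking the label with mouse button 1
should then end the program.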


Cheers,

Dan


Re: [newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean-Michel Pichavant
- Original Message -
> I'm trying out Tkinter with the (non object oriented) code fragment
> below:
> It works partially as I expected, but I thought that pressing "1"
> would
> cause the program to quit, however I get this message:
> TypeError: quit() takes no arguments (1 given), I tried changing quit
> to quit()
> but that makes things even worse. So my question: can anyone here
> help me
> debug this?
> 
> #!/usr/bin/env python
> import Tkinter as tk
> def quit():
>     sys.exit()
> root = tk.Tk()
> label = tk.Label(root, text="Hello, world")
> label.pack()
> label.bind("<1>", quit)
> root.mainloop()
> 
> p.s. I like the code not object orientated

The engine is probably passing an argument to your quit callback.

Try:

def quit(param):
  sys.exit(str(param))

You probably don't even care about the parameter:

def quit(param):
  sys.exit()

JM




[newbie] problem trying out simple non object oriented use of Tkinter

2013-12-06 Thread Jean Dubois
I'm trying out Tkinter with the (non object oriented) code fragment below:
It works partially as I expected, but I thought that pressing "1" would
cause the program to quit, however I get this message:
TypeError: quit() takes no arguments (1 given), I tried changing quit to quit()
but that makes things even worse. So my question: can anyone here help me
debug this?

#!/usr/bin/env python
import Tkinter as tk
def quit():
    sys.exit()
root = tk.Tk()
label = tk.Label(root, text="Hello, world")
label.pack()
label.bind("<1>", quit)
root.mainloop()

p.s. I like the code not object orientated


Re: squeeze out some performance

2013-12-06 Thread Chris Angelico
On Fri, Dec 6, 2013 at 8:46 PM, Jeremy Sanders  wrote:
> This sort of code is probably harder to make faster in pure python. You
> could try profiling it to see where the hot spots are. Perhaps the choice of
> arrays or sets might have some speed impact.

I'd make this recommendation MUCH stronger.

Rule 1 of optimization: Don't.
Rule 2 (for experts only): Don't yet.

Once you find that your program actually is running too slowly, then
AND ONLY THEN do you start looking at tightening something up. You'll
be amazed how little you need to change; start with good clean
idiomatic code, and then if it takes too long, you tweak just a couple
of things and it's fast enough. And when you do come to the
tweaking...

Rule 3: Measure twice, cut once.
Rule 4: Actually, measure twenty times, cut once.

Profile your code to find out what's actually slow. This is very
important. Here's an example from a real application (not in Python,
it's in a semantically-similar language called Pike):

https://github.com/Rosuav/Gypsum/blob/d9907e1507c52189c83ae25f5d7be85235b616fa/window.pike

I noticed that I could saturate one CPU core by typing commands very
quickly. Okay. That gets us past the first two rules (it's a MUD
client, it should not be able to saturate one core of an i5). The code
looks roughly like this:

paint():
    for line in lines:
        if line_is_visible:
            paint_line(line)

paint_line():
    for piece_of_text in text:
        if highlighted: draw_highlighted()
        else: draw_not_highlighted()

My first guess was that the actual drawing was taking the time, since
that's a whole lot of GTK calls. But no; the actual problem was the
iteration across all lines and then finding out if they're visible or
not (possibly because it obliterates the CPU caches). Once the
scrollback got to a million lines or so, that was prohibitively
expensive. I didn't realize that until I actually profiled the code
and _measured_ where the time was being spent.
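
For a Python program, the measuring step could be sketched as below,
assuming the standard-library cProfile and pstats modules. The functions
paint, paint_line and line_is_visible are hypothetical stand-ins modelled
on the rough outline above, not the real Pike code, and the line count is
arbitrary.

```python
import cProfile
import io
import pstats


def line_is_visible(index, first_visible, last_visible):
    # Stand-in visibility check; the real one would consult the widget.
    return first_visible <= index <= last_visible


def paint_line(line):
    # Stand-in for the actual drawing calls.
    return len(line)


def paint(lines, first_visible, last_visible):
    """Walk every line of scrollback, painting only the visible ones."""
    painted = 0
    for index, line in enumerate(lines):
        if line_is_visible(index, first_visible, last_visible):
            paint_line(line)
            painted += 1
    return painted


def profile_paint(line_count=200_000):
    """Profile one paint() call and return (painted, report text)."""
    lines = ["text"] * line_count
    profiler = cProfile.Profile()
    profiler.enable()
    painted = paint(lines, line_count - 30, line_count - 1)
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)
    return painted, report.getvalue()


painted, report = profile_paint()
print(report)
```

In the report, the call counts are the thing to look at first: the
visibility check runs once per line of scrollback, while paint_line runs
only for the handful of lines that are visible.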

How fast does your code run? How fast do you need it to run? Lots of
optimization questions are answered by "Yaknow what, it don't even
matter", unless you're running in a tight loop, or on a
microcontroller, or something. Halving the time taken sounds great
until you see that it's currently taking 0.0001 seconds and happens in
response to user action.

ChrisA


Re: Embedding multiple interpreters

2013-12-06 Thread Chris Angelico
On Fri, Dec 6, 2013 at 8:35 PM, Garthy
 wrote:
> I think the ideal is completely sandboxed, but it's something that I
> understand I may need to make compromises on. The bare minimum would be
> protection against inadvertent interaction. Better yet would be a setup that
> made such interaction annoyingly difficult, and the ideal would be where it
> was impossible to interfere.

In Python, "impossible to interfere" is a pipe dream. There's no way
to stop Python from fiddling around with the file system, and if
ctypes is available, with memory in the running program. The only way
to engineer that kind of protection is to prevent _the whole process_
from doing those things (using OS features, not Python features),
hence the need to split the code out into another process (which might
be chrooted, might be running as a user with no privileges, etc).
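
The "another process" part can be sketched in Python as below, assuming
Python 3.7 or later. The helper name run_isolated is hypothetical, and the
sketch only keeps interpreter state apart; chroot, unprivileged users and
similar restrictions are OS configuration that is not shown here.

```python
import subprocess
import sys


def run_isolated(source, timeout=10):
    """Run Python source in a separate interpreter process.

    Whatever the child does to its own sys, modules or globals stays in
    the child. This is not a security boundary by itself.
    """
    result = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout


parent_stdout = sys.stdout
child_source = (
    "import sys\n"
    "print('child ran', flush=True)\n"
    "sys.stdout = None  # global state changed in the child only\n"
)
returncode, output = run_isolated(child_source)
print(returncode, output.strip(), sys.stdout is parent_stdout)
```

The parent's sys.stdout is untouched afterwards, which is the kind of
protection against inadvertent interaction being asked about.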

A setup that makes such interaction "annoyingly difficult" is possible
as long as your users don't think Ruby. For instance:

# script1.py
import sys
sys.stdout = open("logfile", "w")
while True: print("Blah blah")

# script2.py
import sys
sys.stdout = open("otherlogfile", "w")
while True: print("Bleh bleh")


These two scripts won't play nicely together, because each has
modified global state in a different module. So you'd have to set that
as a rule. (For this specific example, you probably want to capture
stdout/stderr to some sort of global log file anyway, and/or use the
logging module, but it makes a simple example.) Most Python scripts
aren't going to do this sort of thing, or if they do, will do very
little of it. Monkey-patching other people's code is a VERY rare thing
in Python.
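
As a sketch of the logging-module alternative mentioned in passing, each
script could get its own named logger instead of replacing sys.stdout. The
helper make_script_logger and the in-memory streams are assumptions made
for illustration; real scripts would more likely log to files.

```python
import io
import logging


def make_script_logger(name, stream):
    """Return a logger that writes only to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


log1_buffer = io.StringIO()
log2_buffer = io.StringIO()
log1 = make_script_logger("script1", log1_buffer)
log2 = make_script_logger("script2", log2_buffer)

log1.info("Blah blah")
log2.info("Bleh bleh")

print(log1_buffer.getvalue(), end="")
print(log2_buffer.getvalue(), end="")
```

Neither script touches shared global state here, so the two can run in one
interpreter without stepping on each other's output.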

> The closest analogy for understanding would be browser plugins: Scripts from
> multiple authors who for the most part aren't looking to create deliberate
> incompatibilities or interference between plugins. The isolation is basic,
> and some effort is made to make sure that one plugin can't cripple another
> trivially, but the protection is not exhaustive.

Browser plugins probably need a lot more protection - maybe it's not
exhaustive, but any time someone finds a way for one plugin to affect
another, the plugin / browser authors are going to treat it as a bug.
If I understand you, though, this is more akin to having two forms on
one page and having JS validation code for each. It's trivially easy
for one to check the other's form objects, but quite simple to avoid
too, so for the sake of encapsulation you simply stay safe.

> With the single interpreter and multiple thread approach suggested, do you
> know if this will work with threads created externally to Python, ie. if I
> can create a thread in my application as normal, and then call something
> like PyGILState_Ensure() to make sure that Python has the internals it needs
> to work with it, and then use the GIL (or similar) to ensure that accesses
> to it remain thread-safe?

Now that's something I can't help with. The only time I embedded
Python seriously was a one-Python-per-process system (arbitrary number
of processes fork()ed from one master, but each process had exactly
one Python environment and exactly one database connection, etc), and
I ended up being unable to make it secure, so I had to switch to
embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
curious what my boss plans to do, now that he's fired me; he hinted at
rewriting the C++ engine in PHP, and I'd love to be a fly on the wall
as he tries to test a PHP extension for V8 and figure out whether or
not he can trust arbitrary third-party compiled code). But there'll be
someone on this list who's done threads and embedded Python.

ChrisA


Re: using ffmpeg command line with python's subprocess module

2013-12-06 Thread Ned Batchelder

On 12/6/13 4:23 AM, Mark Lawrence wrote:
> On 06/12/2013 06:23, iMath wrote:
>
> Dearest iMath, wouldst thou be kind enough to partake of obtaining some
> type of email client that dost not sendeth double spaced data into this
> most illustrious of mailing lists/newsgroups.  Thanking thee for thine
> participation in my most humble of requests.  I do remain your most
> obedient servant.

iMath seems to be a native Chinese speaker.  I think this message, 
though amusing, will be baffling and won't have any effect...


--Ned.



Re: Embedding multiple interpreters

2013-12-06 Thread Tim Golden
On 06/12/2013 09:27, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 7:21 PM, Garthy
>  wrote:
>> PS. Apologies if any of these messages come through more than once. Most
>> lists that I've posted to set reply-to meaning a normal reply can be used,
>> but python-list does not seem to. The replies I have sent manually to
>> python-list@python.org instead don't seem to have appeared. I'm not quite
>> sure what is happening- apologies for any blundering around on my part
>> trying to figure it out.
> 
> They are coming through more than once. If you're subscribed to the
> list, sending to python-list@python.org should be all you need to do -
> where else are they going?


I released a batch from the moderation queue from Garthy first thing
this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
first as to whether they'd already got through to the list some other way.

TJG




Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris,

On 06/12/13 19:57, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 7:21 PM, Garthy
>   wrote:
>> PS. Apologies if any of these messages come through more than once. Most
>> lists that I've posted to set reply-to meaning a normal reply can be used,
>> but python-list does not seem to. The replies I have sent manually to
>> python-list@python.org instead don't seem to have appeared. I'm not quite
>> sure what is happening- apologies for any blundering around on my part
>> trying to figure it out.
>
> They are coming through more than once. If you're subscribed to the
> list, sending to python-list@python.org should be all you need to do -
> where else are they going?

I think I've got myself sorted out now. The mailing list settings are a 
bit different from what I am used to and I just need to reply to 
messages differently than I normally do.


First attempt for three emails each went to the wrong place, second 
attempt for each appeared to have disappeared into the ether and I 
assumed non-delivery, but I was incorrect and they all actually arrived 
along with my third attempt at each.


Apologies to all for the inadvertent noise.

Cheers,
Garth

