from:"Bruce"

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-30 Thread Bruce Merry



Bruce Merry  added the comment:

> But it's going to be non-trivial, I fear.

Yeah. Maybe some documentation is achievable in the short term though, so that 
users who care more about latency than changing DNS are aware that they should 
do the lookup themselves?
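
As a rough sketch of what "doing the lookup themselves" could look like: resolve the hostname once at start-up and pass the resulting IP address to the handler, so that later sends should not need a DNS query. The helper names `resolve_ipv4_once` and `make_handler` are hypothetical, and the sketch assumes the log server's address does not change while the process runs.

```python
import logging
import logging.handlers
import socket


def resolve_ipv4_once(host, port):
    """Resolve host to one IPv4 address string with a single blocking lookup."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_DGRAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is (ip, port).
    return infos[0][4][0]


def make_handler(host, port):
    """Build a DatagramHandler that targets the pre-resolved IP address."""
    return logging.handlers.DatagramHandler(resolve_ipv4_once(host, port), port)
```

The trade-off is the one discussed in this thread: a DNS change is not picked up until the handler is rebuilt.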

--

___
Python tracker 
<https://bugs.python.org/issue47149>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-29 Thread Bruce Merry



Bruce Merry  added the comment:

> Hmm. I'm not sure we should try to work around a bad resolver issue. What's 
> your platform, and how did you install Python?

Fair point. It's Ubuntu 20.04, running inside Docker, with the default Python 
(3.8). I've also reproduced it outside Docker (again Ubuntu 20.04 with system 
Python). The TTL is 30s, so I'm not sure why systemd-resolved isn't caching it 
for messages logged several times a second.

Even if the system has a local cache though, it's not ideal to have logging 
block when the TTL expires, particularly in an event-driven (asyncio) service. 
Updating the address in a background thread while continuing to log to the old 
address might be better. But my use case is particularly real-time (even 10ms 
of latency is problematic), and maybe that shouldn't drive the default 
behaviour.

I blame the lack of standard POSIX functions for doing DNS lookups 
asynchronously and in a way that provides TTL information to the client.

--

___
Python tracker 
<https://bugs.python.org/issue47149>
___

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-29 Thread Bruce Merry



Bruce Merry  added the comment:

> Yes, that's what I mean. Isn't the resolver library smart enough to cache 
> lookups and handle the TTL timeout by itself?

Apparently not in this case - with tcpdump I can see the DNS requests being 
fired off several times a second. I'll need to check what the TTL actually is 
though.

--

___
Python tracker 
<https://bugs.python.org/issue47149>
___

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-29 Thread Bruce Merry


Bruce Merry  added the comment:

> If you don’t look it up every time, how do you deal with DNS timeouts?

Do you mean expiring the IP address when the TTL is reached? I suppose that 
could be an issue for a long-running service, and I don't have a good answer to 
that. Possibly these days with service meshes and load-balancers it is less of 
a concern since a logging server can move without changing its IP address.

But it's important for a logging system not to block the service doing the 
logging (which is one reason for using UDP in the first place). I only 
discovered this issue because of some flaky DNS servers that would occasionally 
take several seconds to answer a query, and block the whole asyncio event loop 
while it waited.

At a minimum it would be useful to document it, so that you know it's something 
to be concerned about when using DatagramHandler.
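
For an asyncio service, one way to keep the initial lookup off the event loop is to resolve through the loop's own `getaddrinfo`, which runs the lookup in an executor. This is only a sketch under the same assumption that a one-time lookup is acceptable; `make_handler_async` is a hypothetical name.

```python
import asyncio
import logging.handlers
import socket


async def make_handler_async(host, port):
    """Resolve host without blocking the event loop, then build the handler."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_DGRAM
    )
    ip = infos[0][4][0]  # sockaddr is (ip, port) for AF_INET
    return logging.handlers.DatagramHandler(ip, port)
```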

--

___
Python tracker 
<https://bugs.python.org/issue47149>
___

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-29 Thread Bruce Merry



Change by Bruce Merry :


--
type:  -> performance

___
Python tracker 
<https://bugs.python.org/issue47149>
___

[issue47149] DatagramHandler doing DNS lookup on every log message

2022-03-29 Thread Bruce Merry



New submission from Bruce Merry :

logging.DatagramHandler uses socket.sendto to send the messages. If the given 
address is a hostname rather than an IP address, it will do a DNS lookup every 
time.

I suspect that fixing issue 14855 will also fix this, since fixing that issue 
requires resolving the hostname to determine whether it is an IPv4 or IPv6 
address to create a suitable socket.

I've run into this on 3.8, but tagging 3.10 since the code still looks the same.

--
components: Library (Lib)
messages: 416247
nosy: bmerry
priority: normal
severity: normal
status: open
title: DatagramHandler doing DNS lookup on every log message
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue47149>
___

[issue46693] dataclass generated __str__ does not use overridden member __str__

2022-02-09 Thread Bruce Eckel



Bruce Eckel  added the comment:

Oops. That does in fact work. How do I remove the bug report?

*Bruce Eckel*
HappyPathProgramming.com
SummerTechForum.com
MindViewLLC.com
Blog: BruceEckel.com
EvolveWork.co
WinterTechForum.com <http://www.WinterTechForum.com>
OnJava8.com <http://www.OnJava8.com>
www.AtomicKotlin.com
Reinventing-Business.com <http://www.Reinventing-Business.com>

On Wed, Feb 9, 2022 at 10:20 AM Eric V. Smith 
wrote:

>
> Eric V. Smith  added the comment:
>
> I believe dataclasses uses repr() of the members, not str(). Can you try
> specifying __repr__ in Teacup? Just __repr__ = __str__ should work.
>
> --
> nosy: +eric.smith
>
> ___
> Python tracker 
> <https://bugs.python.org/issue46693>
> ___
>

--

___
Python tracker 
<https://bugs.python.org/issue46693>
___

[issue46693] dataclass generated __str__ does not use overridden member __str__

2022-02-09 Thread Bruce Eckel



New submission from Bruce Eckel :

When creating a dataclass using members of other classes that have overridden 
their __str__ methods, the __str__ method synthesized by the dataclass ignores 
the overridden __str__ methods in its component members.

Demonstrated in attached file.
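
The attached file is not reproduced in this archive, so the following is only a guess at a minimal reproduction. `Teacup` is the class named in the reply on this issue; `FixedTeacup` and `Shelf` are hypothetical. It illustrates that the text generated for a dataclass is built from `repr()` of each field, so a member that overrides only `__str__` is shown with its default repr.

```python
from dataclasses import dataclass


class Teacup:
    def __init__(self, label):
        self.label = label

    def __str__(self):
        return f"Teacup({self.label})"


class FixedTeacup(Teacup):
    # Reuse the string form for repr(), as suggested in the reply.
    __repr__ = Teacup.__str__


@dataclass
class Shelf:
    cup: Teacup


plain = str(Shelf(Teacup("a")))       # falls back to the default object repr
fixed = str(Shelf(FixedTeacup("a")))  # now includes "Teacup(a)"
```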

--
components: Interpreter Core
files: DataClassStrBug.py
messages: 412927
nosy: Bruce Eckel
priority: normal
severity: normal
status: open
title: dataclass generated __str__ does not use overridden member __str__
type: behavior
versions: Python 3.10
Added file: https://bugs.python.org/file50611/DataClassStrBug.py

___
Python tracker 
<https://bugs.python.org/issue46693>
___

[issue21644] Optimize bytearray(int) constructor to use calloc()

2021-11-14 Thread Bruce Merry



Bruce Merry  added the comment:

> I abandoned the issue because I didn't have time to work on it. If you want, 
> you can open a new issue for that.

If I make a pull request and run some microbenchmarks, will you (or some other 
core dev) have time to review it? I've had a bad experience before with a PR 
that I'm still unable to get reviewed after several years, so I'd like to get 
at least a tentative agreement before I invest time in it.

--

___
Python tracker 
<https://bugs.python.org/issue21644>
___

[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT

2021-07-29 Thread Bruce Merry



Bruce Merry  added the comment:

> Will you accept patches to fix this for 3.9? I'm not clear whether the "bug 
> fixes only" status of 3.9 allows for fixing performance regressions.

Never mind, I see you already answered this on bpo-42853 (as a no). Thanks for 
taking the time to answer my questions; I'll just have to skip Python 3.9 for 
this particular application and go straight to 3.10.

--

___
Python tracker 
<https://bugs.python.org/issue36050>
___

[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT

2021-07-29 Thread Bruce Merry



Bruce Merry  added the comment:

> There is nothing to do here.

Will you accept patches to fix this for 3.9? I'm not clear whether the "bug 
fixes only" status of 3.9 allows for fixing performance regressions.

--

___
Python tracker 
<https://bugs.python.org/issue36050>
___

[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-07-29 Thread Bruce Merry



Bruce Merry  added the comment:

> A patch would not land in Python 3.9 since this would be a new feature and 
> out-of-scope for a released version.

I see it as a fix for this bug. While there is already a fix, it regresses 
another bug (bpo-36050), so this would be a better fix.

> Do you really want to store gigabytes of downloads in RAM instead of doing 
> chunked reads and store them on disk?

I work on HPC applications where large quantities of data are stored in an 
S3-compatible object store and fetched over HTTP at 25Gb/s for processing. The 
data access layer tries very hard to avoid even making extra copies in memory 
(which is what caused me to file bpo-36050 in the first place) as it makes a 
significant difference at those speeds. Buffering to disk would be right out.

> then there are easier and better ways to deal with large buffers

Your example code is probably fine if one is working directly on an SSLSocket, 
but http.client wraps it in a buffered reader (via `socket.makefile`), and that 
implements `readinto` by reading into a temporary and copying it 
(https://github.com/python/cpython/blob/8d0647485db5af2a0f0929d6509479ca45f1281b/Modules/_io/bufferedio.c#L88),
 which would add overhead.

I appreciate that what I'm proposing is a relatively complex change for a 
released version. A less intrusive option would be to change MAXAMOUNT in 
http.client from 1MiB to 2GiB-1byte (as suggested by @matan1008). That would 
still leave 3.9 slower than 3.8 when reading >2GiB responses over plain HTTP, 
but at least everything in the range [1MiB, 2GiB) would operate at full speed 
(which is the region I actually care about).

--

___
Python tracker 
<https://bugs.python.org/issue42853>
___

[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-07-28 Thread Bruce Merry



Bruce Merry  added the comment:

> It seems like we could have support for OpenSSL 1.1.1 at that level with a 
> compile time fallback for previous OpenSSL versions that break up the work. 
> Would hope this solution also yields something we can backport more easily

I'd have to look at exactly how the SSL_read API works, but I think once we're 
in C land and can read into regions of a buffer, reading in 2GB chunks is 
unlikely to cause a performance hit (unlike the original bpo-36050, where 
Python had to read a bunch of separate buffers then join them together). So 
trying to have 3.9 support both SSL_read_ex AND have a fallback sounds like 
it's adding complexity and risking inconsistency if the fallback doesn't 
perfectly mimic the SSL_read_ex path, for very little gain.
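
To illustrate the idea of reading into regions of one buffer in bounded chunks, here is a Python-level sketch. It is not the proposed `_ssl.c` change; `readinto_chunked` and `MAX_CHUNK` are hypothetical names, and `fp` is assumed to be any binary file object with a `readinto` method.

```python
import io

MAX_CHUNK = 2**31 - 1  # largest length that fits in a signed 32-bit int


def readinto_chunked(fp, nbytes, max_chunk=MAX_CHUNK):
    """Fill one preallocated buffer using reads of at most max_chunk bytes.

    Returns (buffer, bytes_read); bytes_read < nbytes means EOF came early.
    """
    buf = bytearray(nbytes)
    view = memoryview(buf)
    pos = 0
    while pos < nbytes:
        # Slicing a memoryview does not copy; the read lands directly in buf.
        n = fp.readinto(view[pos:pos + max_chunk])
        if not n:
            break
        pos += n
    return buf, pos
```

Because every read writes straight into its slice of the final buffer, there is nothing to join afterwards, which is the property the comment above relies on.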

If no-one else steps up sooner I can probably work on a patch, but before 
sinking time into it I'd like to hear if there is agreement that this is a 
reasonable approach and ideally have a volunteer to review it (hopefully 
someone who is familiar with OpenSSL, since I've only briefly dealt with it 
years ago and crypto isn't somewhere you want to make mistakes).

--

___
Python tracker 
<https://bugs.python.org/issue42853>
___

[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-07-28 Thread Bruce Merry



Bruce Merry  added the comment:

This fix is going to cause a regression of bpo-36050. Would it not be possible 
to fix this in _ssl.c (by breaking a large read into multiple smaller calls to 
SSL_read)? It seems like fixing this at the SSL layer is more appropriate than 
trying to work around it at the HTTP layer, and thus impacting the performance 
of all HTTP fetches (whether using TLS or not, and whether >2GB or not).

--
nosy: +bmerry

___
Python tracker 
<https://bugs.python.org/issue42853>
___

[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT

2021-07-28 Thread Bruce Merry



Bruce Merry  added the comment:

Re-opening because the patch to fix this has just been reverted due to 
bpo-42853.

--
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue36050>
___

[issue43182] TURTLE: Default values for basic Turtle commands

2021-02-14 Thread Bruce Fuda



Bruce Fuda  added the comment:

Added turtle experts to nosy list

--
nosy: +gregorlingl, willingc

___
Python tracker 
<https://bugs.python.org/issue43182>
___

[issue43182] TURTLE: Default values for basic Turtle commands

2021-02-09 Thread Bruce Fuda



New submission from Bruce Fuda :

Since Python is being taught to students of all ages and in particular younger 
kids (i.e. 6 - 10 yo) who are learning not just python, but many related things 
for the first time, the need to include values as arguments to basic Turtle 
functions like:

forward()
backward()
left()
right()

adds additional complexity. The natural starting point for these types of 
activities with kids is a transition from the physical world of devices such as 
the Bee-bot 
(https://www.tts-international.com/bee-bot-programmable-floor-robot/1015268.html)
 which have pre-defined lengths (i.e. 15cm) and turn angles (i.e. 90 degrees) 
so that they can focus primarily on sequencing and learning commands without 
the cognitive load of additional information.

It would be an ideal starting point for the forward(), backward(), left() and 
right() commands to be overloaded such that there is a default value. This 
reduces the cognitive load for young students, because it instead means they 
can focus on learning basic commands, sequencing and syntax without also 
needing to understand angles and relative lengths.

Suggestion would be to set a default value for movement (i.e. forward and 
backward) of 50, and default angles for turning (left and right) of 90 degrees.
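
Until something like this exists in the module itself, the suggestion can be approximated with a thin wrapper. This is only a sketch: `DefaultMoves` is a hypothetical name, and it assumes the wrapped object offers the four methods listed above.

```python
class DefaultMoves:
    """Wrap a turtle-like object so the four basic moves have defaults."""

    def __init__(self, turtle, step=50, angle=90):
        self._turtle = turtle
        self.step = step    # default distance for forward/backward
        self.angle = angle  # default turn in degrees for left/right

    def forward(self, distance=None):
        self._turtle.forward(self.step if distance is None else distance)

    def backward(self, distance=None):
        self._turtle.backward(self.step if distance is None else distance)

    def left(self, angle=None):
        self._turtle.left(self.angle if angle is None else angle)

    def right(self, angle=None):
        self._turtle.right(self.angle if angle is None else angle)


# Intended use (needs a display):
#   import turtle
#   t = DefaultMoves(turtle.Turtle())
#   t.forward(); t.right(); t.forward()
```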

This simple (and trivial) change would massively help thousands of students get 
started with Python, without in any way impacting existing use of Turtle.

--
components: Library (Lib)
messages: 386750
nosy: Bruce1979
priority: normal
severity: normal
status: open
title: TURTLE: Default values for basic Turtle commands
type: enhancement
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue43182>
___

[issue21644] Optimize bytearray(int) constructor to use calloc()

2020-09-15 Thread Bruce Merry



Bruce Merry  added the comment:

Was this abandoned just because nobody had the time, or was there a problem 
with the approach? I independently wanted this optimisation, and have ended up 
implementing something very similar to what was reverted in 
https://hg.python.org/lookup/dff6b4b61cac.

In a benchmark that creates a large bytearray, then fills it with 
socket.readinto, I'm seeing a 2x performance improvement on Linux, and from 
some quick benchmarking it seems to be just as fast as the old code for small 
arrays that are allocated from the pool.

--
nosy: +bmerry

___
Python tracker 
<https://bugs.python.org/issue21644>
___

[issue32528] Change base class for futures.CancelledError

2020-07-06 Thread Bruce Merry



Bruce Merry  added the comment:

FYI this has just bitten me after updating my OS to one that ships Python 3.8. 
It is code that was written with asyncio cancellation in mind and which 
expected CancelledError to be caught with "except Exception" (the exception 
block unwound incomplete operations before re-raising the exception).

It's obviously too late to do anything about Python 3.8, but I'm mentioning 
this as a data point in support of having a deprecation period if similar 
changes are made in future.

On the plus side, while fixing up my code and checking all instances of "except 
Exception" I found some places where this change did fix latent cancellation 
bugs. So I'm happy with the change, just a little unhappy that it came as a 
surprise.
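
A small sketch of the behaviour change, assuming Python 3.8 or later, where `asyncio.CancelledError` derives from `BaseException`. The `worker` and `main` names are hypothetical. The `except Exception` block no longer runs on cancellation, while a `finally` block still does.

```python
import asyncio


async def worker(log):
    try:
        await asyncio.sleep(3600)
    except Exception:
        # Ran on cancellation before 3.8; skipped on 3.8+.
        log.append("except Exception")
        raise
    finally:
        # Runs on every version, so cleanup is safer here.
        log.append("finally")


async def main():
    log = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)  # let the worker reach its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log
```

Cleanup that must also run on cancellation can live in `finally`, or the handler can name `asyncio.CancelledError` explicitly.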

--
nosy: +bmerry

___
Python tracker 
<https://bugs.python.org/issue32528>
___

[issue41200] Add pickle.loads fuzz test

2020-07-03 Thread Bruce Day



New submission from Bruce Day :

add pickle.loads(x) fuzz test

--
components: Tests
messages: 372916
nosy: Bruce Day
priority: normal
pull_requests: 20438
severity: normal
status: open
title: Add pickle.loads fuzz test
type: security
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue41200>
___

[issue41002] HTTPResponse.read with amt is slow

2020-06-18 Thread Bruce Merry


Bruce Merry  added the comment:

> (perhaps 'MB/s's are wrong).

Why, are you getting significantly different results?

Just in case it's confusing, the results are reported as A ± B MB/s, where A is 
the mean and B is the standard deviation of the mean. So it's about 3GB/s when 
no length is passed, or 1GB/s when a length is passed.

--

___
Python tracker 
<https://bugs.python.org/issue41002>
___

[issue41002] HTTPResponse.read with amt is slow

2020-06-17 Thread Bruce Merry



Change by Bruce Merry :


--
keywords: +patch
pull_requests: +20124
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20943

___
Python tracker 
<https://bugs.python.org/issue41002>
___

[issue41002] HTTPResponse.read with amt is slow

2020-06-17 Thread Bruce Merry



Change by Bruce Merry :


--
type:  -> performance

___
Python tracker 
<https://bugs.python.org/issue41002>
___

[issue41002] HTTPResponse.read with amt is slow

2020-06-17 Thread Bruce Merry


New submission from Bruce Merry :

I've run into this on 3.8, but the code on Git master doesn't look 
significantly different so I assume it still applies. I'm happy to work on a PR 
for this.

When http.client.HTTPResponse.read is called with a specific amount to read, it 
goes down this code path:
```
if amt is not None:
    # Amount is given, implement using readinto
    b = bytearray(amt)
    n = self.readinto(b)
    return memoryview(b)[:n].tobytes()
```
That's pretty inefficient, because
- `bytearray(amt)` will first zero-fill some memory
- `tobytes()` will make an extra copy of this memory
- if amt is big enough, it'll cause the temporary memory to be allocated from 
the kernel, which will *also* zero-fill the pages for security.

A better approach would be to use the read method of the underlying fp.
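
The two strategies can be compared outside `http.client` with any binary file object. This is a sketch, not the eventual patch; `read_via_readinto` and `read_direct` are hypothetical names and `fp` stands in for the response's underlying file object.

```python
import io


def read_via_readinto(fp, amt):
    """Mirror the current code path: zero-filled buffer, fill, then copy."""
    b = bytearray(amt)                  # zero-fills amt bytes
    n = fp.readinto(b)                  # overwrites the zeros
    return memoryview(b)[:n].tobytes()  # copies the data once more


def read_direct(fp, amt):
    """Let the file object allocate and return the bytes itself."""
    return fp.read(amt)
```

Both return the same bytes; the difference is only in how many times memory is written.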

I have a micro-benchmark (that I'll attach) showing that for a 1GB body and 
reading the whole body with or without the amount being explicit, performance 
is reduced from 3GB/s to 1GB/s.

For some unknown reason the requests library likes to read the body in 10KB 
chunks even if the user has requested the entire body, so this will help here 
(although the gains probably won't be as big because 10KB is really too small 
to amortise all the accounting overhead).

Output from my benchmark, run against a 1GB file on localhost:

httpclient-read: 3019.0 ± 63.8 MB/s
httpclient-read-length: 1050.3 ± 4.8 MB/s
httpclient-read-raw: 3150.3 ± 5.3 MB/s
socket-read: 3134.4 ± 7.9 MB/s

--
components: Library (Lib)
files: httpbench-simple.py
messages: 371732
nosy: bmerry
priority: normal
severity: normal
status: open
title: HTTPResponse.read with amt is slow
versions: Python 3.8
Added file: https://bugs.python.org/file49239/httpbench-simple.py

___
Python tracker 
<https://bugs.python.org/issue41002>
___

[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-19 Thread Bruce Merry



Bruce Merry  added the comment:

+tzickel I'd suggest reading the discussion in issue 36051, and maybe raising a 
new issue about it if you still have concerns. In short, dropping the GIL in 
more bytes.join cases wouldn't necessarily be wrong, but it might break code 
that made the assumption that bytes.join is atomic even though that's never 
been claimed.

--

___
Python tracker 
<https://bugs.python.org/issue39974>
___

[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Bruce Merry



Bruce Merry  added the comment:

> static_buffers is not a static variable. It is auto local variable.
> So I think other thread don't hijack it.

Oh yes, quite right. I should have looked closer at the code first before 
commenting. I think this can be closed as not-a-bug, unless +tzickel has 
example code that gives the wrong output?

> perhaps add an if to check if the backing object is really mutable ? 
> (Py_buffer.readonly)

It's not just the buffer data being mutable that's an issue, it's the owning 
object. It's possible for an object to expose a read-only buffer, but also 
allow the buffer (including its size or address) to be mutated through its own 
API.

> Also, semi related, (dunno where to discuss it), would a good .join() 
> optimization be to add an optional length parameter, like .join(iterable, 
> length=10)

You could always open a separate bug for it, but I can't see it catching on 
given that one needs to modify one's code for it.

--

___
Python tracker 
<https://bugs.python.org/issue39974>
___

[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Bruce Merry



Bruce Merry  added the comment:

Good catch! I'll take a look this week to see what makes sense for the use case 
for which I originally proposed this optimisation.

--

___
Python tracker 
<https://bugs.python.org/issue39974>
___

[issue39641] concatenation of Tuples

2020-02-16 Thread bruce blosser



bruce blosser  added the comment:

ok -  well sorry, I am obviously in way over my head, and now very confused...  

I was just going by what was being said on a number of python web sites, 
including one where I am taking a class in intermediate python coding, and 
thought I was seeing a conflict between what I was being told, and what I was 
finding when running code.

so I will try not to come back here, unless I have some major problem, that 
seems more like a bug

thanks
bruce

--

___
Python tracker 
<https://bugs.python.org/issue39641>
___

[issue39641] concatenation of Tuples

2020-02-16 Thread bruce blosser



bruce blosser  added the comment:

read the advice...
Yes this does work:

  ("Hello", 1, None) + (23, 19.5, "Goodbye")
('Hello', 1, None, 23, 19.5, 'Goodbye')
because you are not creating a 3rd string!

but try this, and it will NOT work:

FatThing= [(5, 4, "First Place"),
   (6, 6, "Fifer Place"),
   (2, 2, "Slowr Place")]
print(FatThing)  #this works

FFThing = FatThing + ('22', '32', '55')  #this causes an error!

however if you change all the members to strings, it will work!!!
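
A short sketch of what the snippet above seems to run into, on the assumption that the error comes from `FatThing` being a list (square brackets) rather than a tuple. The lower-case names are for illustration only.

```python
# Tuple + tuple works whatever the element types are.
mixed = ("Hello", 1, None) + (23, 19.5, "Goodbye")

# A list of tuples, as in the message above.
fat_thing = [(5, 4, "First Place"),
             (6, 6, "Fifer Place"),
             (2, 2, "Slowr Place")]

# list + tuple raises TypeError, regardless of what the members are.
try:
    fat_thing + ("22", "32", "55")
    error = None
except TypeError as exc:
    error = str(exc)

# Matching the container types on both sides avoids the error.
as_list = fat_thing + [("22", "32", "55")]
as_tuple = tuple(fat_thing) + (("22", "32", "55"),)
```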

--
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue39641>
___

[issue39641] concatenation of Tuples

2020-02-15 Thread bruce blosser



New submission from bruce blosser :

The concatenation of two tuples into a third tuple, using the + command, causes 
an error if every member of each of the two tuples is NOT a string!  This does 
not appear to be documented ANYWHERE, and really causes a whole lot of head 
scratching and more than enough foul language!  :)

So how does one "add" two tuples together, to create a third tuple, if the 
members are not all strings?

--
messages: 362036
nosy: bruceblosser
priority: normal
severity: normal
status: open
title: concatenation of Tuples
type: behavior
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue39641>
___

[issue36051] Drop the GIL during large bytes.join operations?

2020-01-16 Thread Bruce Merry



Bruce Merry  added the comment:

I think I've addressed the concerns that were raised in this bug, but let me 
know if I've missed any.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2020-01-05 Thread Bruce Merry



Bruce Merry  added the comment:

I ran the test on a Xeon machine (Skylake-XP) and it also looks like 
performance is only improved from 1MB up (somewhat to my surprise, given how 
poor single-threaded memcpy performance is on that machine). So I've updated 
the pull request with that threshold.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2020-01-05 Thread Bruce Merry



Bruce Merry  added the comment:

I've written a variant of the benchmark in which one thread does joins and the 
other does unrelated CPU-bound work that doesn't touch memory much. It also 
didn't show much benefit to thresholds below 512KB. I still want to test things 
on a server-class CPU, but based on the evidence so far I'm okay with a 1MB 
threshold.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2020-01-02 Thread Bruce Merry



Bruce Merry  added the comment:

I'm realising that the benchmark makes it difficult to see what's going on 
because it doesn't separate overhead costs (slowdowns because 
releasing/acquiring the GIL is not free, particularly when contended) from 
cache effects (slowdowns due to parallel threads creating more cache pressure 
than threads that take turns). inada.naoki's version of the benchmark is better 
here because it uses the same input data for all the threads, but the output 
data will still be different in each thread.

For example, on my system I see a big drop in speedup (although I still get 
speedup) with the new benchmark once the buffer size gets to 2MB per thread, 
which is not surprising with an 8MB L3 cache.

My feeling is that we should try to ignore cache effects when picking a 
threshold, because we can't predict them generically (they'll vary by workload, 
thread count, CPU etc) whereas users can benchmark specific use cases to decide 
whether multithreading gives them a benefit. If the threshold is too low then 
users can always choose not to use multi-threading (and in general one doesn't 
expect much from it in Python) but if the threshold is too high then users have 
no recourse. That being said, 65536 does still seem a bit low based on the 
results available.

I'll try to write a variant of the benchmark in which other threads just spin 
in Python without creating memory pressure to see if that gives a different 
picture. I'll also run the benchmark on a server CPU when I'm back at work.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Bruce Merry  added the comment:

> Do you think it would be sufficient to change the stress test from joining 
> 1000 items to joining 10 items?

Actually that won't work, because the existing stress test is using a non-empty 
separator. I'll add another version of that stress test that uses an empty 
separator.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Bruce Merry  added the comment:

> I'll take a look at extra unit tests soon. Do you know off the top of your 
> head where to look for existing `join` tests to add to?

Never mind, I found it: 
https://github.com/python/cpython/blob/92709a263e9cec0bc646ccc1ea051fc528800d8d/Lib/test/test_bytes.py#L535-L559

Do you think it would be sufficient to change the stress test from joining 1000 
items to joining 10 items?

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Bruce Merry  added the comment:

I've attached a benchmark script and CSV results for master (whichever version 
that was at the point I forked) and with unconditional dropping of the GIL. It 
shows up to 3x performance improvement when using 4 threads. That's on my home 
desktop, which is quite old (Sandy Bridge). I'm expecting more significant 
gains on server CPUs, whose memory systems are optimised for multi-threaded 
workloads. The columns are chunk size, number of chunks, number of threads, and 
per-thread throughput.

There are also cases where using multiple threads is a slowdown, but I think 
that's an artifact of the benchmark. It repeatedly joins the same strings, so 
performance is higher when they all fit in the cache; when using 4 threads that 
execute in parallel, the working set is 4x larger and may cease to fit in 
cache. In real-world usage one is unlikely to be joining the same strings again 
and again.

In the single-threaded case, the benchmark seems to show that for 64K+, 
performance is improved by dropping the GIL (which I'm guessing must be 
statistical noise, since there shouldn't be anything contending for it), which 
is my reasoning behind the 65536 threshold.
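
The attached benchjoin.py is not reproduced here. As a rough sketch of this kind of measurement (the function name, parameters and repeat count are assumptions made for illustration, not the attached script), one could time several threads that each repeatedly join the same list of chunks:

```python
import threading
import time


def bench_join(chunk_size, n_chunks, n_threads, repeats=10):
    """Return per-thread throughput (bytes/second) of b"".join."""
    chunks = [b"x" * chunk_size for _ in range(n_chunks)]

    def worker():
        # Every thread joins the same chunks over and over.
        for _ in range(repeats):
            b"".join(chunks)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - start
    # Bytes joined by one thread, divided by the wall-clock time.
    return chunk_size * n_chunks * repeats / elapsed


if __name__ == "__main__":
    for n_threads in (1, 2, 4):
        rate = bench_join(65536, 64, n_threads)
        print(n_threads, "threads:", round(rate / 1e6), "MB/s per thread")
```

Because the same chunks are joined on every iteration, this sketch has the same cache-residency caveat as described above.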

I'll take a look at extra unit tests soon. Do you know off the top of your head 
where to look for existing `join` tests to add to?

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Change by Bruce Merry :


Added file: https://bugs.python.org/file48813/benchjoin.py

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Change by Bruce Merry :


Added file: https://bugs.python.org/file48812/new.csv

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-31 Thread Bruce Merry



Change by Bruce Merry :


Added file: https://bugs.python.org/file48811/old.csv

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-30 Thread Bruce Merry



Bruce Merry  added the comment:

If we want to be conservative, we could only drop the GIL if all the buffers 
pass the PyBytes_CheckExact test. Presumably that won't encounter any of these 
problems because bytes objects are immutable?

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-30 Thread Bruce Merry



Change by Bruce Merry :


--
keywords: +patch
pull_requests: +17193
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17757

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36051] Drop the GIL during large bytes.join operations?

2019-12-22 Thread Bruce Merry



Bruce Merry  added the comment:

> It seems we can release GIL during iterating the buffer array.

That's what I had in mind. Naturally it would require a bit of benchmarking to 
pick a threshold such that the small case doesn't lose performance due to 
locking overheads. If no one else is working on it, I can give that a try early 
next year.

--

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue38242] Revert the new asyncio Streams API

2019-09-30 Thread Bruce Merry



Change by Bruce Merry :


--
nosy: +bmerry

___
Python tracker 
<https://bugs.python.org/issue38242>
___

[issue37141] Allow multiple separators in Stream.readuntil

2019-09-26 Thread Bruce Merry



Bruce Merry  added the comment:

I've submitted a PR: https://github.com/python/cpython/pull/16429

--

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in Stream.readuntil

2019-09-26 Thread Bruce Merry



Change by Bruce Merry :


--
keywords: +patch
pull_requests: +16008
stage: test needed -> patch review
pull_request: https://github.com/python/cpython/pull/16429

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in Stream.readuntil

2019-09-12 Thread Bruce Merry



Bruce Merry  added the comment:

I finally have permission from my employer to sign the contributors agreement, 
so I'll take a stab at this when I have some free time (unless somebody else 
gets to it first).

--

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in Stream.readuntil

2019-06-03 Thread Bruce Merry



Bruce Merry  added the comment:

Ok, I've changed the issue title to refer to Stream. Since this would be a new 
feature, I assume it's off the table for 3.8, but I'll see if I get time to 
implement a PR in time for 3.9 (and get someone at work to sign off on the 
contributor agreement, which might be the harder part).

Thanks for the quick and helpful responses.

--
title: Allow multiple separators in StreamReader.readuntil -> Allow multiple 
separators in Stream.readuntil

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in StreamReader.readuntil

2019-06-03 Thread Bruce Merry



Bruce Merry  added the comment:

Ok. Does the new Stream still have a similar interface for readuntil i.e. is 
this still a relevant request against the new API? I'm happy to let deprecated 
APIs stay as-is.

--

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in StreamReader.readuntil

2019-06-03 Thread Bruce Merry



Bruce Merry  added the comment:

I wasn't aware of that deprecation - it doesn't seem to be mentioned at 
https://docs.python.org/3.8/library/asyncio-stream.html. What is the 
replacement?

--

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue37141] Allow multiple separators in StreamReader.readuntil

2019-06-03 Thread Bruce Merry



New submission from Bruce Merry :

Text-based protocols sometimes allow a choice of newline separator - I work 
with one that allows either \r or \n. Unfortunately that doesn't work with 
StreamReader.readuntil, which only accepts a single separator, so I've had to 
do some hacky things to obtain lines without having to 

From discussion in issue 32052, it sounded like extending 
StreamReader.readuntil to support a tuple of separators would be feasible.
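
To make the request concrete, here is a minimal sketch of the search that a tuple of separators implies, written against a plain bytes buffer rather than StreamReader. The helper names find_first_separator and split_line are hypothetical and not part of asyncio.

```python
def find_first_separator(buffer, separators):
    """Return (index, separator) for the earliest separator, or None."""
    best = None
    for sep in separators:
        idx = buffer.find(sep)
        # Keep the match that starts earliest; on a tie the separator
        # listed first in the tuple wins.
        if idx != -1 and (best is None or idx < best[0]):
            best = (idx, sep)
    return best


def split_line(buffer, separators=(b"\r", b"\n")):
    """Split off the first line, including its separator.

    Returns (line, rest), or (None, buffer) if no separator is present.
    """
    hit = find_first_separator(buffer, separators)
    if hit is None:
        return None, buffer
    idx, sep = hit
    end = idx + len(sep)
    return buffer[:end], buffer[end:]
```

With the default separators, split_line(b"abc\rdef\n") splits at the \r, leaving b"def\n" for the next call.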

--
components: asyncio
messages: 344397
nosy: asvetlov, bmerry, yselivanov
priority: normal
severity: normal
status: open
title: Allow multiple separators in StreamReader.readuntil
type: enhancement
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue37141>
___

[issue32052] Provide access to buffer of asyncio.StreamReader

2019-06-03 Thread Bruce Merry



Bruce Merry  added the comment:

Ok, I'll open a separate issue to allow a tuple of possible separators.

--
nosy: +bmerry

___
Python tracker 
<https://bugs.python.org/issue32052>
___

[issue36051] (Performance) Drop the GIL during large bytes.join operations?

2019-02-20 Thread Bruce Merry



New submission from Bruce Merry :

A common pattern in libraries doing I/O is to receive data in chunks, put them 
in a list, then join them all together using b"".join(chunks). For example, see 
http.client.HTTPResponse._safe_read. When the output is large, the memory 
copies can block the interpreter for a non-trivial amount of time, and prevent 
multi-threaded scaling. If the GIL could be dropped during the memcpys it could 
improve parallel I/O performance in some high-bandwidth scenarios (36050 
mentions a case where I've run into this serialisation bottleneck in practice).

Obviously it could hurt performance to drop the GIL for small cases. As far as 
I know numpy uses thresholds to decide when it's worth dropping the GIL and it 
seems to work fairly well.
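
For reference, the pattern in question looks roughly like this. It is a minimal sketch: the helper name read_all and the chunk size are illustrative only and not taken from any library.

```python
import io


def read_all(stream, chunk_size=65536):
    """Read a stream to EOF in chunks and stitch the chunks together."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    # The final join copies every chunk once more; for large payloads this
    # is the step that currently runs with the GIL held.
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"abc" * 100000
    assert read_all(io.BytesIO(payload)) == payload
```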

--
components: Interpreter Core
messages: 336082
nosy: bmerry
priority: normal
severity: normal
status: open
title: (Performance) Drop the GIL during large bytes.join operations?
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue36051>
___

[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT

2019-02-20 Thread Bruce Merry



New submission from Bruce Merry :

While investigating poor HTTP read performance I discovered that reading all 
the data from a response with a content-length goes via _safe_read, which in 
turn reads in chunks of at most MAXAMOUNT (1MB) before stitching them together 
with b"".join. This can really hurt performance for responses larger than 
MAXAMOUNT, because
(a) the data has to be copied an additional time; and
(b) the join operation doesn't drop the GIL, so this limits multi-threaded 
scaling.

I'm struggling to see any advantage in doing this chunking - it's not saving 
memory either (in fact it is wasting it).

To give an idea of the performance impact, changing MAXAMOUNT to a very large 
value made a multithreaded test of mine go from 800MB/s to 2.5GB/s (which is 
limited by the network speed).
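
One way to avoid both the extra copy and the join is to read straight into a preallocated buffer. The following is only a sketch under that assumption: read_exact is a hypothetical helper, not the code in http.client, and it returns a bytearray rather than bytes to avoid a final copy.

```python
import io


def read_exact(stream, amount):
    """Read exactly `amount` bytes into one preallocated buffer."""
    buf = bytearray(amount)
    view = memoryview(buf)
    pos = 0
    while pos < amount:
        # readinto() writes directly into the final buffer, so there is
        # no list of chunks to join afterwards.
        n = stream.readinto(view[pos:])
        if not n:
            raise EOFError("stream ended after %d of %d bytes" % (pos, amount))
        pos += n
    return buf


if __name__ == "__main__":
    print(read_exact(io.BytesIO(b"abcdef"), 4))
```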

--
components: Library (Lib)
messages: 336081
nosy: bmerry
priority: normal
severity: normal
status: open
title: Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue36050>
___

[issue32052] Provide access to buffer of asyncio.StreamReader

2018-10-13 Thread Bruce Merry



Bruce Merry  added the comment:

A sequence of possible terminators would cover my immediate use case and 
certainly be an improvement.

To facilitate more general use cases without exposing implementation details, 
would it be practical and maintainable to have a "putback" method that prepends 
data to the buffer? It might not be fast in all cases (e.g. it might have to 
make a copy of what's still in the buffer), but possibly BufferedReader could 
detect the common case (putting back a suffix of what's just been read) and 
adjust its offsets into its internal buffer (although I'm not at all familiar 
with BufferedReader, so feel free to tell me I'm talking nonsense).
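
A toy illustration of the "putback" idea, including the fast path for putting back a suffix of what was just read. PutbackBuffer is a hypothetical stand-alone class, not an asyncio or io API, and for brevity it never compacts its internal buffer.

```python
class PutbackBuffer:
    def __init__(self):
        self._buf = bytearray()
        self._pos = 0  # offset of the first unread byte

    def feed(self, data):
        """Append newly received data."""
        self._buf += data

    def read(self, n):
        """Return up to n unread bytes and advance past them."""
        data = bytes(self._buf[self._pos:self._pos + n])
        self._pos += len(data)
        return data

    def putback(self, data):
        """Make `data` the next bytes returned by read()."""
        start = self._pos - len(data)
        if start >= 0 and self._buf[start:self._pos] == data:
            # Common case: a suffix of what was just read; only the
            # offset needs to move.
            self._pos = start
        else:
            # General case: insert in front of the unread bytes, which
            # shifts (copies) the rest of the buffer.
            self._buf[self._pos:self._pos] = data
```

For example, after feed(b"hello\nworld") and read(8), calling putback(b"wo") only rewinds the offset, so the next read(5) returns b"world".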

--

___
Python tracker 
<https://bugs.python.org/issue32052>
___

Observations on the List - "Be More Kind"

2018-10-05 Thread Bruce Coram

I will declare at the outset, I am a lurker.  I don't know enough about 
Python to give advice that I could 100% guarantee would be helpful.


There have been two recent threads that summarise for me where the 
Python Mailing List has lost its way (and this started before Trump 
arrived as a new role model for how to treat your fellow man):


"Re: This thread is closed [an actual new thread]"

"Re: So apparently I've been banned from this list"

The level of vitriol and personal attacks on the moderators was 
profoundly disappointing, but not totally out of character for those who 
made the attacks.  There is no doubt that these people know software and 
Python and this certainly earns my respect,  but perhaps they need to 
retain a sense of perspective.  There are 7 billion people in the 
world.  There are plenty more people at least as good as you and many 
better, but they don't have the compelling urge to demonstrate their 
genius.  They get on with their work in a quiet professional manner.


Some humility in acknowledging that you stand on the shoulders of giants 
would not go amiss.  It might also reflect that you understand the good 
fortune that dealt you such a good hand in life.


You aren't always right, and you don't always have to insist on being 
right.  I found Steve D'Aprano always had to have the last word and had 
to prove he was right.  I found some of his posts to be intemperate in tone.


Why is there a need to score points with caustic remarks or throwaway 
comments?  Perhaps the person who posed his question should have read 
the documents, perhaps he should have searched the archives.  Tell them 
so politely and firmly.  If you cannot manage that then why say 
anything?  Not everyone who posts a poorly prepared question is idle and 
deserves a response that is less than polite.  Prepare a boilerplate 
standard reply that is polite for those questions that are easily 
resolved by the poster.


Perhaps the person who posts something you regard as nonsense is 
ignorant and lacks the knowledge they think they possess.  Instead of 
wasting your time with scholastic debate, put the time to good use 
improving your education in subjects you don't excel at.  I can 
guarantee the depth of your ignorance will be profound - there will be 
much for you to learn.  The effort some of you put in to the endless 
debates suggests that you have plenty of time on your hands - don't 
waste it.  It will be gone soon enough.


Don't waste time on the trolls, some of whom undoubtedly enjoy the 
ability to provoke a response.  Develop a greater sense of self 
awareness to enable you to recognise that you are being played.  The 
intemperate tone of some of the exchanges damages the reputation of the 
List.


Life is hard enough without us adding to it.  Try silence as a response.

Listen to Frank Turner's latest album: "Be More Kind".   That is not a 
plug to buy the album, but the title seems apposite - and the music is good.


Regards

Bruce Coram
--
https://mail.python.org/mailman/listinfo/python-list

[issue34094] Porting Python 2 to Python 3 example contradicts its own advice

2018-07-11 Thread Bruce Richardson



Bruce Richardson  added the comment:

Ah, doh. my bad.

On 11 July 2018 at 16:09, Zachary Ware  wrote:

>
> Zachary Ware  added the comment:
>
> I don't agree with your conclusion here: importlib2 is a PyPI package that
> backports Python 3's importlib to Python 2, thus the ImportError will only
> be raised on Python 2 with the example as written.
>
> --
> nosy: +zach.ware
>
> ___
> Python tracker 
> <https://bugs.python.org/issue34094>
> ___
>

--

___
Python tracker 
<https://bugs.python.org/issue34094>
___

[issue34094] Porting Python 2 to Python 3 example contradicts its own advice

2018-07-11 Thread Bruce Richardson



New submission from Bruce Richardson :

https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection

In this section, the (very good) advice is "It would be better to treat Python 
2 as the exceptional case instead of Python 3 and assume that future Python 
versions will be more compatible with Python 3 than Python 2"

However, it then goes on to present this as the best solution (for dealing 
with library imports):

    try:
        from importlib import abc
    except ImportError:
        from importlib2 import abc

This is literally treating Python 3 as the exception, completely contradicting 
the advice given a few lines earlier.  Practically, it also has the effect 
that, as Python 3 adoption spreads, import errors and automatic retries will 
become *more* common and then universal, adding a small amount of delay and 
noise to the entire Python estate.  And that's not considering the case where 
both libraries are installed to cope with old code relying on the old library 
(in which case you surely want new code to default to using the new library)

If the example is simply changed to

    try:
        from importlib2 import abc
    except ImportError:
        from importlib import abc

then both the contradiction and the practical problems go away

--
assignee: docs@python
components: Documentation
messages: 321436
nosy: Bruce Richardson, brett.cannon, docs@python
priority: normal
severity: normal
status: open
title: Porting Python 2 to Python 3 example contradicts its own advice

___
Python tracker 
<https://bugs.python.org/issue34094>
___

[issue34047] IDLE: on macOS, scroll slider 'sticks' at bottom of file

2018-07-05 Thread Bruce Elgort


Bruce Elgort  added the comment:

Terry,

Here is a video I made showing the problem I’m having. I have no clue about the 
other things you are asking about.

https://www.youtube.com/watch?v=BpyMhdjTNvQ 

Bruce

> On Jul 5, 2018, at 3:21 PM, Terry J. Reedy  wrote:
> 
> 
> Change by Terry J. Reedy :
> 
> 
> --
> stage:  -> test needed
> title: Scrolling in IDLE for OS X is not working correctly when reaching end 
> of file -> IDLE: on macOS, scroll slider 'sticks' at bottom of file
> 
> ___
> Python tracker 
> <https://bugs.python.org/issue34047>
> ___

--

___
Python tracker 
<https://bugs.python.org/issue34047>
___

[issue34047] Scrolling in IDLE for OS X is not working correctly when reaching end of file

2018-07-05 Thread Bruce Elgort



Bruce Elgort  added the comment:

2.7.15 scrolling is working just fine.

--

___
Python tracker 
<https://bugs.python.org/issue34047>
___

[issue34047] Scrolling in IDLE for OS X is not working correctly when reaching end of file

2018-07-04 Thread Bruce Elgort



Change by Bruce Elgort :


--
title: Scrolling in IDLE for OS X is not working -> Scrolling in IDLE for OS X 
is not working correctly when reaching end of file

___
Python tracker 
<https://bugs.python.org/issue34047>
___

[issue34047] Scrolling in IDLE for OS X is not working

2018-07-04 Thread Bruce Elgort



New submission from Bruce Elgort :

When using IDLE on OS X, after scrolling to the bottom of a file you are not 
able to scroll back up using the mouse. You need to use the arrow keys.

--
assignee: terry.reedy
components: IDLE
messages: 321058
nosy: belgort, terry.reedy
priority: normal
severity: normal
status: open
title: Scrolling in IDLE for OS X is not working
type: behavior
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue34047>
___

[issue32395] asyncio.StreamReader.readuntil is not general enough

2017-12-20 Thread Bruce Merry

New submission from Bruce Merry <bme...@gmail.com>:

I'd proposed one specific solution in Issue 32052 which asvetlov didn't like,
so as requested I'm filing a bug about the problem rather than the solution.

The specific case I have is reading a protocol in which either \r or \n can be
used to terminate lines. With StreamReader.readuntil, it's only possible to
specify one separator, so it can't easily be used (*).

Some nice-to-have features, from specific to general:
1. Specify multiple alternate separators.
2. Specify a regex for a separator.
3. Specify a regex for the line.
4. Specify a callback that takes a string and returns the position of the end
of the line, if any.

Of course, some of these risk quadratic-time behaviour if they have to check
the whole buffer every time the buffer is extended, so that would need to be
considered in the design. In the last case, the callback could take care of it
itself by maintaining internal state.

(*) I actually have a solution for this case
(https://github.com/ska-sa/aiokatcp/blob/bd8263cefe213003a218fac0dd8c5207cc76aeef/aiokatcp/connection.py#L44-L52),
but it only works because \r and \n are semantically equivalent in the
particular protocol I'm parsing.
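
As an illustration of option 2 combined with the internal-state idea, here is a minimal sketch of a scanner that remembers how far it has already searched, so bytes that did not contain a separator are not rescanned when more data arrives. LineScanner is a hypothetical stand-alone class, not an asyncio API, and it assumes the separator pattern matches exactly one byte; a multi-byte separator could straddle two feeds and would need the scan offset backed off accordingly.

```python
import re


class LineScanner:
    def __init__(self, pattern=rb"[\r\n]"):
        # Assumption: the pattern always matches exactly one byte.
        self._regex = re.compile(pattern)
        self._buf = bytearray()
        self._scanned = 0  # bytes at the front already known to have no separator

    def feed(self, data):
        """Add data and return the complete lines now available."""
        self._buf += data
        lines = []
        start = 0            # start of the line currently being assembled
        pos = self._scanned  # where to resume searching
        while True:
            match = self._regex.search(self._buf, pos)
            if match is None:
                break
            lines.append(bytes(self._buf[start:match.end()]))
            start = pos = match.end()
        # Drop consumed lines; whatever remains has been fully searched.
        del self._buf[:start]
        self._scanned = len(self._buf)
        return lines
```

Each byte is searched once, so the total work stays linear in the amount of data received rather than quadratic.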

--
components: asyncio
messages: 308852
nosy: Bruce Merry, yselivanov
priority: normal
severity: normal
status: open
title: asyncio.StreamReader.readuntil is not general enough
type: enhancement
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32395>
___

[issue32052] Provide access to buffer of asyncio.StreamReader

2017-11-16 Thread Bruce Merry


New submission from Bruce Merry <bme...@gmail.com>:

While asyncio.StreamReader.readuntil is an improvement on only having readline, 
it is still quite limited e.g. you cannot have multiple possible terminators. 
The real problem is that it's not possible to roll your own without accessing 
_underscore fields (other than by reading one byte at a time, which I'm 
guessing would be bad for performance). I'm not sure exactly what a public API 
to assist would look like, but I think the following would be a good start:

1. A get_buffer method, that returns (self._buffer, self._eof); the caller must 
treat the buffer as readonly.
2. A wait_for_data method to wait for the return value of get_buffer to change 
(basically like current _wait_for_data)
3. Access to the _limit attribute.

With that available, I think readuntil or more complex variants of it could be 
implemented externally using only the public interface (consumption of data 
from the buffer would be via readexactly rather than by messing with the buffer 
array directly).

--
components: asyncio
messages: 306397
nosy: Bruce Merry, yselivanov
priority: normal
severity: normal
status: open
title: Provide access to buffer of asyncio.StreamReader
type: enhancement
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32052>
___

[issue29902] copy breaks staticmethod

2017-03-25 Thread Bruce Frederiksen


New submission from Bruce Frederiksen:

Doing a copy on a staticmethod breaks it:

Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from copy import copy
>>> def foo(): pass
... 
>>> class bar: pass
... 
>>> bar.x = staticmethod(foo)
>>> bar.x.__name__
'foo'
>>> bar.y = copy(staticmethod(foo))
>>> bar.y.__name__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: uninitialized staticmethod object

--
components: Library (Lib)
messages: 290481
nosy: dangyogi
priority: normal
severity: normal
status: open
title: copy breaks staticmethod
type: behavior
versions: Python 3.5

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29902>
___

[issue29001] logging.handlers.RotatingFileHandler rotation broken under gunicorn

2016-12-17 Thread Bruce Edge


New submission from Bruce Edge:

I've been seeing some funny behavior in log files from a 
logging.handlers.RotatingFileHandler.

Here's my handler config snippet, in a "log.cfg":

[handlers]
keys=log_file

[formatters]
keys=access

[formatter_access]
format=%(asctime)s - %(name)s(%(funcName)s:%(lineno)s)[%(process)d] - %(levelname)s: %(message)s
datefmt=%Y-%m-%d %H:%M:%S
class=logging.Formatter

[handler_log_file]
class=logging.handlers.RotatingFileHandler
formatter=access
args=('/var/log/ml-api/access.log', 'a', 1000, 16, 'utf8')

Specified using:

 /usr/local/bin/gunicorn --worker-class=gevent --bind unix:ml-api.sock -m 007 --workers=8 --log-config=log.cfg --log-level=DEBUG app

I ran this script to show the progression of log file creation:

while true ; do ls -lrt && sleep 10; done

Initially there's one incrementing log file, fine:

-rw-r--r-- 1 ubuntu www-data 9765042 Dec 16 17:50 access.log
total 9572
-rw-r--r-- 1 ubuntu www-data 9796828 Dec 16 17:50 access.log
total 9656
-rw-r--r-- 1 ubuntu www-data 9881053 Dec 16 17:50 access.log
total 9708
-rw-r--r-- 1 ubuntu www-data 9936776 Dec 16 17:50 access.log
total 9756
-rw-r--r-- 1 ubuntu www-data 9984782 Dec 16 17:50 access.log
total 9816

But as soon as it gets rotated, I immediately get 8 log files:

-rw-r--r-- 1 ubuntu www-data     911 Dec 16 17:50 access.log.7
-rw-r--r-- 1 ubuntu www-data    2578 Dec 16 17:50 access.log.5
-rw-r--r-- 1 ubuntu www-data    2578 Dec 16 17:50 access.log.3
-rw-r--r-- 1 ubuntu www-data    6871 Dec 16 17:50 access.log.2
-rw-r--r-- 1 ubuntu www-data    5122 Dec 16 17:50 access.log.1
-rw-r--r-- 1 ubuntu www-data   11165 Dec 16 17:50 access.log.4
-rw-r--r-- 1 ubuntu www-data    1718 Dec 16 17:50 access.log
-rw-r--r-- 1 ubuntu www-data    2905 Dec 16 17:50 access.log.6
total 9864
-rw-r--r-- 1 ubuntu www-data     911 Dec 16 17:50 access.log.7
-rw-r--r-- 1 ubuntu www-data    8921 Dec 16 17:50 access.log.6
-rw-r--r-- 1 ubuntu www-data   15460 Dec 16 17:50 access.log
-rw-r--r-- 1 ubuntu www-data   10313 Dec 16 17:51 access.log.2
-rw-r--r-- 1 ubuntu www-data    7699 Dec 16 17:51 access.log.3
-rw-r--r-- 1 ubuntu www-data   21471 Dec 16 17:51 access.log.4
-rw-r--r-- 1 ubuntu www-data    6874 Dec 16 17:51 access.log.5
-rw-r--r-- 1 ubuntu www-data   11989 Dec 16 17:51 access.log.1
total 9892
-rw-r--r-- 1 ubuntu www-data     911 Dec 16 17:50 access.log.7
-rw-r--r-- 1 ubuntu www-data    7699 Dec 16 17:51 access.log.3
-rw-r--r-- 1 ubuntu www-data   12849 Dec 16 17:51 access.log.1
-rw-r--r-- 1 ubuntu www-data   14936 Dec 16 17:51 access.log.6
-rw-r--r-- 1 ubuntu www-data   30068 Dec 16 17:51 access.log.4
-rw-r--r-- 1 ubuntu www-data   19755 Dec 16 17:51 access.log
-rw-r--r-- 1 ubuntu www-data   11170 Dec 16 17:51 access.log.5
-rw-r--r-- 1 ubuntu www-data   15466 Dec 16 17:51 access.log.2
total 9932

Is this a consequence of the gunicorn --workers=8?

Does logging.handlers.RotatingFileHandler not work with gunicorn workers?

I tried --worker-class=gevent as well with the same result.
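
For context: each worker is a separate process with its own RotatingFileHandler instance, and the logging documentation notes that logging to a single file from multiple processes is not supported. One arrangement that is often suggested for this situation, though not confirmed anywhere in this report, is to leave rotation to an external tool such as logrotate and have each worker use WatchedFileHandler, which reopens the file once it has been moved. A minimal sketch, where make_worker_logger is a hypothetical helper and the reopen-on-rename behaviour applies to Unix-like systems:

```python
import logging
import logging.handlers


def make_worker_logger(path, name="app"):
    """Return a logger that appends to `path` and reopens it after rotation."""
    logger = logging.getLogger(name)
    # WatchedFileHandler checks before each emit whether the file at `path`
    # was moved or removed, and reopens it if so. It never rotates by itself.
    handler = logging.handlers.WatchedFileHandler(path, encoding="utf8")
    handler.setFormatter(logging.Formatter(
        "%(asctime)s - %(name)s(%(funcName)s:%(lineno)s)[%(process)d] - "
        "%(levelname)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger
```

With this arrangement only one external process decides when to rotate, so the workers no longer race each other to rename the file.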

--
components: Extension Modules
messages: 283511
nosy: Bruce Edge
priority: normal
severity: normal
status: open
title: logging.handlers.RotatingFileHandler rotation broken under gunicorn
type: behavior
versions: Python 2.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29001>
___

[issue27586] Is this a regular expression library bug?

2016-07-22 Thread Bruce Eckel


Bruce Eckel added the comment:

Thank you ebarry, very helpful. Tim, sorry I missed you at Pycon.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27586>
___

[issue27586] Is this a regular expression library bug?

2016-07-21 Thread Bruce Eckel


Bruce Eckel added the comment:

Urk. There was exactly a \g in the input. Sorry for the bother.
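
For anyone who hits the same error: the first argument to sub() is a replacement template, so backslash sequences such as \g in it are interpreted. A minimal sketch of one way to insert arbitrary text verbatim is to pass a function instead of a string (replace_output is a hypothetical helper name; the pattern is the one from the program in this thread):

```python
import re

find_output = re.compile(r"/\* (Output:.*)\*/", re.DOTALL)


def replace_output(javatext, new_output):
    """Replace the /* Output: ... */ block with new_output, verbatim."""
    # The return value of a replacement function is used as-is; no
    # backslash escapes or group references are processed in it.
    return find_output.sub(lambda match: new_output, javatext)


if __name__ == "__main__":
    java = "class A {}\n/* Output:\nold\n*/\n"
    new = "/* Output:\nC:\\games\\x\n*/"
    print(replace_output(java, new))
```

Escaping the backslashes first, for example with new_output.replace("\\", "\\\\"), is another option when a string replacement is preferred.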

--
resolution:  -> not a bug

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27586>
___

[issue27586] Is this a regular expression library bug?

2016-07-21 Thread Bruce Eckel


Bruce Eckel added the comment:

Sorry, I thought maybe the error message would be indicative of something. 
Here's the re:

find_output = re.compile(r"/\* (Output:.*)\*/", re.DOTALL)

Here's the program:

#! py -3
# Requires Python 3.5
# Updates generated output into extracted Java programs in "On Java 8"
from pathlib import Path
import re
import pprint
import sys

if __name__ == '__main__':
    find_output = re.compile(r"/\* (Output:.*)\*/", re.DOTALL)
    for outfile in Path(".").rglob("*.p1"):
        print(str(outfile))
        javafile = outfile.with_suffix(".java")
        if not javafile.exists():
            print(str(outfile) + " has no javafile")
            sys.exit(1)
        javatext = javafile.read_text()
        if "/* Output:" not in javatext:
            print(str(javafile) + " has no /* Output:")
            sys.exit(1)
        new_output = outfile.read_text()
        new_javatext = find_output.sub(new_output, javatext)
        javafile.write_text(new_javatext)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27586>
___

[issue27586] Is this a regular expression library bug?

2016-07-21 Thread Bruce Eckel


New submission from Bruce Eckel:

This looks suspicious to me, like it could be a library bug, but before chasing 
it down I was hoping someone might be able to tell me whether I might be on to 
something:

Traceback (most recent call last):
  File "update_extracted_example_output.py", line 22, in <module>
    new_javatext = find_output.sub(new_output, javatext)
  File "C:\Python35\lib\re.py", line 325, in _subx
    template = _compile_repl(template, pattern)
  File "C:\Python35\lib\re.py", line 312, in _compile_repl
    p = sre_parse.parse_template(repl, pattern)
  File "C:\Python35\lib\sre_parse.py", line 872, in parse_template
    raise s.error("missing <")
sre_constants.error: missing < at position 100 (line 4, column 41)

--
components: Library (Lib)
messages: 270956
nosy: Bruce Eckel
priority: normal
severity: normal
status: open
title: Is this a regular expression library bug?
type: compile error
versions: Python 3.5

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27586>
___

Re: Python to do CDC on XML files

2016-03-24 Thread Bruce Kirk

I agree. The challenge is the volume of data to compare: 13 million records, 
so it needs to be very fast.

Sent from my iPad

> On Mar 23, 2016, at 4:47 PM, Bob Gailer <bgai...@gmail.com> wrote:
> 
> 
> On Mar 23, 2016 4:20 PM, "Bruce Kirk" <bruce.kir...@gmail.com> wrote:
> >
> > Does anyone know of any existing projects on how to generate a change data 
> > capture on 2 very large xml files.
> >
> > The xml structures are the same, it is the data within the files that may 
> > differ.
> >
> It should not be too difficult to write a program that locates the tags 
> delimiting each record, then compare them.
-- 
https://mail.python.org/mailman/listinfo/python-list

Python to do CDC on XML files

2016-03-23 Thread Bruce Kirk

Does anyone know of any existing projects that generate a change data 
capture (CDC) between 2 very large XML files?

The xml structures are the same, it is the data within the files that may 
differ.

I need to take an XML file from yesterday, compare it to the XML file 
produced today, and note which XML records have changed.
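
One way to approach this with only the standard library is sketched below. It is a rough illustration, not a tested solution for 13 million records. It assumes each record is an element with a known tag ("record" here) and a unique key attribute ("id" here); both names are made up, as are the function names. Each file is streamed with iterparse, every record is reduced to a digest, and the two digest maps are compared.

```python
import hashlib
import io
import xml.etree.ElementTree as ET


def record_digests(source, record_tag, key_attr):
    """Map each record's key attribute to a digest of its serialized form."""
    digests = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            elem.tail = None  # ignore whitespace between records
            serialized = ET.tostring(elem)
            digests[elem.get(key_attr)] = hashlib.sha1(serialized).hexdigest()
            elem.clear()  # drop the record's children to limit memory use
    return digests


def diff_records(old_source, new_source, record_tag="record", key_attr="id"):
    """Return (added, removed, changed) lists of record keys."""
    old = record_digests(old_source, record_tag, key_attr)
    new = record_digests(new_source, record_tag, key_attr)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


yesterday = io.BytesIO(
    b"<data><record id='1'><v>a</v></record>"
    b"<record id='2'><v>b</v></record>"
    b"<record id='3'><v>c</v></record></data>"
)
today = io.BytesIO(
    b"<data><record id='1'><v>a</v></record>"
    b"<record id='2'><v>B</v></record>"
    b"<record id='4'><v>d</v></record></data>"
)
print(diff_records(yesterday, today))
```

Holding one digest per record in memory is usually manageable at this scale, but the comparison is only as good as the key: records without a stable identifier would need a different strategy.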

I have done a google search and I am not able to find much on the subject other 
than software vendors trying to sell me their products. :-)

Regards
-- 
https://mail.python.org/mailman/listinfo/python-list

[issue25991] readline example eventually consumes all memory

2016-01-01 Thread Bruce Frederiksen


New submission from Bruce Frederiksen:

The Example in the readline documentation (section 6.7 of the Library 
Reference) shows how to save your readline history in a file, and restore it 
each time you start Python.

The problem with the Example is that it does not include a call to 
readline.set_history_length and the default is -1 (infinite).

As a Python developer, I start Python quite a lot and had a .python_history 
file that was 850M bytes.  Just starting Python was causing my system to thrash 
before the first prompt (>>>) even appeared.

I suggest adding the following line to the example to avoid this:

readline.set_history_length(1000)

I'm not sure how far back this goes in terms of earlier versions of Python, but 
probably quite far.
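
A minimal sketch of what the amended example could look like is below. The function name enable_history and the limit of 1000 entries are illustrative choices, not the wording of the documentation.

```python
import atexit
import os
import readline


def enable_history(histfile, max_entries=1000):
    """Load saved history, cap its length, and save it again at exit."""
    try:
        readline.read_history_file(histfile)
    except FileNotFoundError:
        pass  # first run: there is no history file yet
    # Without this call the default length is -1, i.e. unlimited growth.
    readline.set_history_length(max_entries)
    atexit.register(readline.write_history_file, histfile)
```

It would typically be called once at startup, for example as enable_history(os.path.join(os.path.expanduser("~"), ".python_history")).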

--
assignee: docs@python
components: Documentation
messages: 257325
nosy: dangyogi, docs@python
priority: normal
severity: normal
status: open
title: readline example eventually consumes all memory
type: resource usage
versions: Python 3.4

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25991>
___

How does one distribute Tkinter or Qt GUI apps Developed in Python

2015-12-16 Thread Bruce Whealton

I watched one training video that discussed Python and Tkinter. Like many 
similar tutorials from online training sites, I was left scratching my head. 

What seems to be blatantly missing is how this would be distributed. In the 
first mentioned tutorial from Lynda.com the Tkinter app was related to a web 
page. However, the browser cannot run Python Bytecode or Python Scripts. 

Surely, one is going to want to create GUI apps for users that are not Python 
Developers. I would not think to ask someone to install Python on their system 
and make sure it is added to the path. Maybe it is not so hard for the 
non-technical, average users. 

I would want to package in some way so that when launched, it installs whatever 
is needed on the end user's computer. How is this done? 
Are there common practices for this? 
Thanks, 
Bruce
-- 
https://mail.python.org/mailman/listinfo/python-list

try/exception - error block

2014-08-03 Thread bruce

Hi.

I have a long running process, it generates calls to a separate py
app. The py app appears to generate errors, as indicated in the
/var/log/messages file for the abrtd daemon.. The errors are
intermittent.

So, to quickly capture all possible exceptions/errors, I decided to
wrap the entire main block of the test py func in a try/except
block.

This didn't work, as I'm not getting any output in the err file
generated in the exception block.

I'm posting the test code I'm using. Pointers/comments would be helpful/useful.


 the if that gets run is the fac1 logic which operates on the input
packet/data..
elif (level=='collegeFaculty1'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList1(url,content)


Thanks.

if __name__ == "__main__":
# main app

  try:
#college="asu"
#url="https://webapp4.asu.edu/catalog"
#termurl="https://webapp4.asu.edu/catalog/TooltipTerms.ext"


#termVal=2141
#
# get the input struct, parse it, determine the level
#

#cmd='cat /apps/parseapp2/asuclass1.dat'
#print "cmd= "+cmd
#proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
#content=proc.communicate()[0].strip()
#print content
#sys.exit()

#s=getClasses(content)

#print "arg1 =",sys.argv[0]
if(len(sys.argv)<2):
  print "error\n"
  sys.exit()

a=sys.argv[1]
aaa=a

#
# data is coming from the parentApp.php
#data has been rawurlencode(json_encode(t))
#-reverse/split the data..
#-do the fetch,
#-save the fetched page/content if any
#-create the returned struct
#-echo/print/return the struct to the
# calling parent/call
#

##print urllib.unquote_plus(a).decode('utf8')
#print "\n"
#print simplejson.loads(urllib.unquote_plus(a))
z=simplejson.loads(urllib.unquote_plus(a))
##z=simplejson.loads(urllib.unquote(a).decode('utf8'))
#z=simplejson.loads(urllib2.unquote(a).decode('utf8'))

#print "aa \n"
print z
#print "\n bb \n"

#
#-passed in
#
url=str(z['currentURL'])
level=str(z['level'])
cname=str(z['parseContentFileName'])


#
# need to check the contentFname
# -should have been checked in the parentApp
# -check it anyway, return err if required
# -if valid, get/import the content into
# the content var for the function/parsing
#

##cmd='echo ${yolo_clientFetchOutputDir}/'
cmd='echo ${yolo_clientParseInputDir}/'
#print "cmd= "+cmd
proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
cpath=proc.communicate()[0].strip()

cname=cpath+cname
#print "cn = "+cname+"\n"
#sys.exit()


cmd='test -e '+cname+' && echo 1'
#print "cmd= "+cmd
proc=subprocess.Popen(cmd, shell=True,stdout=subprocess.PIPE)
c1=proc.communicate()[0].strip()

if(not c1):
  #got an error - process it, return
  print "error in parse"

#
# we're here, no err.. got content
#

#fff= "sdsu2.dat"
with open(cname,"r") as myfile:
  content=myfile.read()


#-passed in
#college="louisville"
#url="http://htmlaccess.louisville.edu/classSchedule/"
#termVal=4138


#print "term = "+str(termVal)+"\n"
#print "url = "+url+"\n"

#jtest()
#sys.exit()

#getTerm(url,college,termVal)


ret={} # null it out to start
if (level=='rState'):
  #ret=getTerm(content,termVal)
  ret=getParseStates(content)

elif (level=='stateCollegeList'):
#getDepts(url,college, termValue,termName)
  ret=getParseStateCollegeList(url,content)

elif (level=='collegeFaculty1'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList1(url,content)

elif (level=='collegeFaculty2'):
#getClasses(url, college, termVal,termName,deptName,deptAbbrv)
  ret=getParseCollegeFacultyList2(content)



#
# the idea of this section.. we have the resulting
# fetched content/page...
#

a={}
status=False
if(ret['status']==True):

  s=ascii_strip(ret['data'])
  if(((s.find("</html>")>-1) or (s.find("</HTML>")>-1)) and
  ((s.find("<html>")>-1) or (s.find("<HTML>")>-1)) and
   level=='classSectionDay'):

    status=True
  #print "herh"
  #sys.exit()

  #
  # build the returned struct
  #
  #

  a['Status']=True
  a['recCount']=ret['count']
  a['data']=ret['data']
  a['nextLevel']=''
  a['timestamp']=''
  a['macAddress']=''
elif(ret['status']==False):
  a['Status']=False
  a['recCount']=0
  a['data']=''
  a['nextLevel']=''
  a['timestamp']=''
  a['macAddress']=''

res=urllib.quote(simplejson.dumps(a))
##print res

name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
name=name.communicate()[0].strip()

Re: try/exception - error block

2014-08-03 Thread bruce

chris.. my bad.. I wasn't intending to mail you personally.

Or I wouldn't have inserted the "thanks guys"!

 thanks guys...

 but in all that.. no one could tell me .. why i'm not getting any
 errs/exceptions in the err file which gets created on the exception!!!

 but thanks for the information on posting test code!

Don't email me privately - respond to the list :)

Also, please don't top-post.

ChrisA


Re: try/exception - error block

2014-08-03 Thread bruce

Hi Alan.

Yep, the err file in the exception block gets created. and the weird
thing is it matches the time of the abrtd information in the
/var/log/messages log..

Just nothing in the file!



On Sun, Aug 3, 2014 at 4:01 PM, Alan Gauld alan.ga...@btinternet.com wrote:
 On 03/08/14 18:52, bruce wrote:

 but in all that.. no one could tell me .. why i'm not getting any
 errs/exceptions in the err file which gets created on the exception!!!


 Does the file actually get created?
 Do you see the print statement output - are they what you expect?

 Did you try the things Steven suggested.


except Exception, e:
  print e
  print "pycolFac1 - error!! \n";
  name=subprocess.Popen('uuidgen -t', shell=True,stdout=subprocess.PIPE)
  name=name.communicate()[0].strip()
  name=name.replace("-","_")


 This is usually a bad idea. You are using name for the process and its
 output. Use more names...
 What about:

 uuid=subprocess.Popen('uuidgen -t',shell=True,stdout=subprocess.PIPE)
 output=uuid.communicate()[0].strip()
 name=output.replace("-","_")

  name2="/home/ihubuser/parseErrTest/pp_"+name+".dat"


 This would be a good place to insert a print

 print name2

  ofile1=open(name2,"w+")


 Why are you using w+ mode? You are only writing.
 Keep life as simple as possible.

  ofile1.write(e)


 e is quite likely to be empty

  ofile1.write(aaa)


 Are you sure aaa exists at this point? Remember you are catching all errors
 so if an error happens prior to aaa being created this will
 fail.

  ofile1.close()


 You used the with form earlier, why not here too.
 It's considered better style...
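
 A possible explanation for the empty file, offered as a guess: write()
 expects a string, so passing the exception object itself can raise a new
 TypeError inside the handler, after the file has been created but before
 anything is written. The sketch below (Python 3; run_logged, err_dir and
 the demo function are made-up names) writes the formatted traceback
 instead, which is a string and says where the error happened.

```python
import tempfile
import traceback
import uuid
from pathlib import Path


def run_logged(main, err_dir):
    """Run main(); on failure, write the full traceback to a new file.

    Returns the path of the error file, or None if main() succeeded.
    """
    try:
        main()
    except Exception:
        # uuid1 is time-based, similar in spirit to `uuidgen -t`.
        err_file = Path(err_dir) / "pp_{}.dat".format(uuid.uuid1().hex)
        with open(err_file, "w") as out:
            out.write(traceback.format_exc())  # a str, safe to write
        return err_file
    return None


def demo_failure():
    return {}["status"]  # raises KeyError, like a missing dict key


log_path = run_logged(demo_failure, tempfile.mkdtemp())
print(log_path.read_text())
```

 Note that "except Exception" does not catch SystemExit, so a sys.exit()
 call inside main() still ends the program without producing a log file.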

 Some final comments.
 1) You call sys.exit() several times inside
 the try block. sys.exit will not be caught by your except block,
 is that what you expect?.

 2) The combination of confusing naming of variables,
 reuse of names and poor code layout and excessive commented
 code makes it very difficult to read your code.
 That makes it hard to figure out what might be going on.
 - Use sensible variable names not a,aaa,z, etc
 - use 3 or 4 level indentation not 2
 - use a version control system (RCS,CVS, SVN,...) instead
   of commenting out big blocks
 - use consistent code style
  eg with f as ... or open(f)/close(f) but not both
 - use the os module (and friends) instead of subprocess if possible

 3) Have you tried deleting all the files in the
 /home/ihubuser/parseErrTest/ folder and starting again,
 just to be sure that your current code is actually
 producing the empty files?

 4) You use tmpParseDir in a couple of places but I don't
 see it being set anywhere?
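
 To illustrate the point about the os module, here is a small sketch
 (Python 3) of the three shell calls done without subprocess. The helper
 names are invented, and it assumes yolo_clientParseInputDir is an ordinary
 environment variable visible to the Python process.

```python
import os
import uuid


def input_path(file_name, env_var="yolo_clientParseInputDir"):
    """Replace `echo ${var}/`: read the directory from the environment."""
    base_dir = os.environ.get(env_var, "")
    return os.path.join(base_dir, file_name)


def read_content(path):
    """Replace `test -e path && echo 1`: return None if the file is missing."""
    if not os.path.exists(path):
        return None
    with open(path, "r") as handle:
        return handle.read()


def unique_name():
    """Replace `uuidgen -t`: a time-based UUID with dashes swapped out."""
    return str(uuid.uuid1()).replace("-", "_")
```

 Besides being shorter, this avoids starting three shell processes per
 request, and errors show up as Python exceptions rather than as empty
 strings from communicate().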


 That's about the best I can offer based on the
 information available.

 --
 Alan G
 Author of the Learn to Program web site
 http://www.alan-g.me.uk/
 http://www.flickr.com/photos/alangauldphotos

 --
 https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Prob. Code Downloaded for Programming the Semantic Web (python code)

2014-07-28 Thread Bruce Whealton

On Friday, July 25, 2014 9:28:32 PM UTC-4, Steven D'Aprano wrote:
 On Fri, 25 Jul 2014 17:06:17 -0700, Bruce Whealton wrote:
Steven,
See below please.  The explanation did help.   
 
  OK, Eclipse with PyDev doesn't like this first line, with the function:
 
  def add(self, (sub, pred, obj)):
 
 
 
 In Python 2, you could include parenthesised parameters inside function
 declarations as above. That is effectively a short cut for this version,
 where you collect a single argument and then expand it into three
 variables:
 
 
 
 def add(self, sub_pred_obj):
 
 sub, pred, obj = sub_pred_obj
 
I setup Eclipse to use python 2.7.x and tried to run this and it just gave an 
error on line 9 where the def add function is declared.  It just says invalid 
syntax and points at the parentheses that are in the function definition
def add(self, (subj, pred, obj)):
So, from what you said, and others, it seems like this should have worked but 
eclipse would not run it.  I could try to load it into IDLE.
 
 
 
 In Python 3, that functionality was dropped and is no longer allowed. Now
 you have to use the longer form.

I'm not sure I follow what the longer method is.  Can you explain that more, 
please. 
 
 
 [...]
 
  There are other places where I thought that there were too many
 
  parentheses and I tried removing one set of them.  For example this
 
  snippet here:
 
  
 
  def remove(self, (sub, pred, obj)):
 
  
 
  Remove a triple pattern from the graph. 
 
  triples = list(self.triples((sub, pred, obj)))
 
 
 
 Firstly, the remove method expects to take a *single* argument (remember
 
 that self is automatically provided by Python) which is then automatically
 
 expanded into three variables sub, pred, obj. So you have to call it with a
 
 list or tuple of three items (or even a string of length exactly 3).
 

 Then, having split this list or tuple into three items, it then joins them
 
 back again into a tuple:
 
 
 
 (sub, pred, obj)
 
 passes that tuple to the triples method:
 
 self.triples((sub, pred, obj))
 
 
 
 (not shown, but presumably it uses the same parenthesised parameter trick),
 
 and then converts whatever triples returns into a list:
 
The full code listing should be available in the code paste link that I 
included.
 
 
 list(self.triples((sub, pred, obj)))
 
 
 
 that list then being bound to the name triples.
 
 
Thanks, the explanation helped,
Bruce
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Prob. Code Downloaded for Programming the Semantic Web (python code)

2014-07-28 Thread Bruce Whealton

On Friday, July 25, 2014 11:25:15 PM UTC-4, Chris Angelico wrote:
 On Sat, Jul 26, 2014 at 10:06 AM, Bruce Whealton
 
Chris,
In response to your comments below, I'm comfortable changing this to use 
python 3.
 As others have said, this is something that changed in Python 3. So
 you have two parts to the problem: firstly, your code is bound to
 Python 2 by a triviality, and secondly, Eclipse is complaining about
 it.
 
 
 
 But a better solution, IMO, would be to avoid that implicit tuple
 unpacking. It's not a particularly clear feature, and I'm not sorry
 it's gone from Py3. The simplest way to change it is to just move it
 into the body:
 

OK, that makes sense. So, I cut out the Alternatively...  suggestion you made.
 
 
 def add(self, args):
 
 sub, pred, obj = args
 
 # rest of code as before
 
 
 
 Preferably with a better name than 'args'.

Yes, I could call it triples. 
 
  triples = list(self.triples((sub, pred, obj)))
 
 
 
  Are the two sets parentheses needed after self.triples?  That syntax is
 
  confusing to me.  It seems that it should be
 
  triples = list(self.triples(sub, pred, obj))
 
 
 
 No, that's correct. The extra parens force that triple to be a single
 
 tuple of three items, rather than three separate arguments. Here's a
 
 simpler example:
 
 >>> lst = []
 >>> lst.append(1,2,3)
 Traceback (most recent call last):
   File "<pyshell#25>", line 1, in <module>
     lst.append(1,2,3)
 TypeError: append() takes exactly one argument (3 given)
 >>> lst.append((1,2,3))
 >>> addme = 4,5,6
 >>> lst.append(addme)
 >>> lst
 [(1, 2, 3), (4, 5, 6)]
 
 
This is helpful and makes sense... clarifies it for me.
 
 The list append method wants one argument, and appends that argument 
 to the list. Syntactically, the comma has multiple meanings; when I
 assign 4,5,6 to a single name, it makes a tuple, but in a function
 call, it separates args in the list. I don't see why the triples()
 function should be given a single argument, though; all it does is
 immediately unpack it. It'd be better to just remove the parens and 
 have separate args:
 
 
 triples = list(self.triples(sub, pred, obj))

I didn't see the above in the code... Is this something I would need to add and 
if so, where? 
 
 
def triples(self, sub, pred, obj):
 
 
 
 While I'm looking at the code, a few other comments. I don't know how
 much of this is your code and how much came straight from the book,
 but either way, don't take this as criticism, but just as suggestions 
 for ways to get more out of Python.
 
So far it is just from the book, and just serves as an example...  It is also a 
few years old, having been published in 2009.
 
 
 Inside remove(), you call a generator (triples() uses yield to return
 multiple values), then construct a list, and then iterate exactly once
 over that list. Much more efficient and clean to iterate directly over
 what triples() returns, as in save(); that's what generators are good
 for.
 
 
 
 In triples(), the code is deeply nested and repetitive. I don't know
 if there's a way to truly solve that, but I would be inclined to 
 flatten it out a bit; maybe check for just one presence, to pick your
 index, and then merge some of the code that iterates over an index.
 Not sure though.
 
I would have to get a better understanding of this.  
 
 
 (Also: It's conventional to use "is not None" rather than "!= None" to
 
 test for singletons. It's possible for something to be equal to None
 
 without actually being None.)
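
 A tiny illustration of that last point, using a made-up class whose
 equality method is deliberately unhelpful:

```python
class AlwaysEqual:
    """Hypothetical class that claims to be equal to everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# The comparison operators consult __eq__, so they are fooled.
print(value == None)      # equality test: reports True
print(value != None)      # inequality test: reports False

# The identity test cannot be overridden and gives the real answer.
print(value is not None)  # identity test: reports True
```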
 
 
 
 I would recommend moving to Python 3, if you can. Among other
 benefits, the Py3 csv module allows you to open a text file rather
 than opening a binary file and manually encoding/decoding all the
 parts separately. Alternatively, if you don't need this to be saving
 and loading another program's files, you could simply use a different
 file format, which would remove the restrictions (and messes) of the 
 CSV structure.

I was curious about why the binary flag was being used.  It just made no sense 
to me.
 
 
 
 Instead of explicitly putting f.close() at the end of your load and
 save methods, check out the 'with' statement. It'll guarantee that the
 file's closed even if you leave early, get an exception, or anything
 
 like that. Also, I'd tend to use the .decode() and .encode() methods,
 rather than the constructors. So here's how I'd write a Py2 load:

I would like to see this in python 3 format.
 
 def load(self, filename):
     with open(filename, "rb") as f:
         for sub, pred, obj in csv.reader(f):
             self.add((sub.decode("UTF-8"), pred.decode("UTF-8"),
                       obj.decode("UTF-8")))
 
 
 
 (You might want to break that back out into three more lines, but this
 
 parallels save(). If you break this one, you probably want to break
 
 save() too.)
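
 For comparison, a Python 3 version might look roughly like the sketch
 below. TinyGraph is a simplified stand-in, not the book's class: it keeps
 triples in a plain set so that the example is self-contained. In Python 3
 the file is opened in text mode with an explicit encoding, so the manual
 decode() and encode() calls disappear.

```python
import csv


class TinyGraph:
    """Simplified stand-in for the book's graph class."""

    def __init__(self):
        self._triples = set()

    def add(self, triple):
        sub, pred, obj = triple
        self._triples.add((sub, pred, obj))

    def save(self, filename):
        # newline="" lets the csv module control line endings itself.
        with open(filename, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            for triple in sorted(self._triples):
                writer.writerow(triple)

    def load(self, filename):
        with open(filename, newline="", encoding="utf-8") as f:
            for sub, pred, obj in csv.reader(f):
                self.add((sub, pred, obj))
```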
 
 
 
 Hope that helps!
 
 
 
 ChrisA

Thanks,
Bruce

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Prob. Code Downloaded for Programming the Semantic Web (python code)

2014-07-28 Thread Bruce Whealton

On Monday, July 28, 2014 11:28:40 AM UTC-4, Steven D'Aprano wrote:
 On Mon, 28 Jul 2014 03:39:48 -0700, Bruce Whealton wrote:
Stephen,
I went to my Ubuntu box inside vmware and added a #!/usr/bin/env python2.7 
to the top.  Then I made the file executable and it ran the code perfectly. 

 
 First step is to confirm that Eclipse actually is using Python 2.7. Can 
 
 you get it to run this code instead? Put this in a module, and then run 
 
 it:
 
 
 
 import sys
 
 print(sys.version)
 
 
I had both python2.7 and python3.4.  I could be less specific with my shebang 
line but what the heck.  
 
 
 
I then installed pydev into my eclipse environment within the Ubuntu virtual 
machine and it ran the program just fine.  So, I suspect the extra character 
was 
only an issue on Windows.  I thought I had it setup to show even hidden 
characters.  
Anyway, thanks so much for all the help...everyone.  It might be interesting 
for me to convert this to a module that runs with python 3.
Bruce 
 
 
  It just
 
  says invalid syntax and points at the parentheses that are in the
 
  function definition def add(self, (subj, pred, obj)):
 
  So, from what you said, and others, it seems like this should have
 
  worked but eclipse would not run it.  I could try to load it into IDLE.
 
 
 
 Whenever you have trouble with one IDE, it's good to get a second opinion 
 
 in another IDE. They might both be buggy, but they're unlikely to both 
 
 have the same bug.
 
 
 
 Also, try to run the file directly from the shell, without an IDE. from 
 
 the system shell (cmd.exe if using Windows, bash or equivalent for 
 
 Linux), run:
 
 
 
 python27 /path/to/yourfile.py
 
 
 
 You'll obviously need to adjust the pathname, possibly even give the full 
 
 path to the Python executable.
 
 
 
 
 
 [...]
 
  In Python 3, that functionality was dropped and is no longer allowed.
 
  Now you have to use the longer form.
 
 
 
  I'm not sure I follow what the longer method is.  Can you explain that
 
  more, please.
 
 
 
 I referred to the parenthesised parameter version as a short cut for a 
 
 method that takes a single argument, then manually expands that argument 
 
 into three items. Let me show them together to make it more obvious:
 
 
 
 # Unparenthesised version, with manual step.
 
 def add(self, sub_pred_obj):
 
 sub, pred, obj = sub_pred_obj
 
 do_stuff_with(sub or pred or obj)
 
 
 
 # Parenthesised shortcut.
 
 def add(self, (sub, pred, obj)):
 
 do_stuff_with(sub or pred or obj)
 
 
 
 Both methods take a single argument, which must be a sequence of exactly 
 
 three values. The second version saves a single line, hence the first 
 
 version is longer :-)
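
 Put together as something runnable in Python 3, the longer form could look
 like the sketch below. TripleStore is a made-up miniature, not the book's
 implementation; it only shows the calling convention. Both methods take one
 tuple, which is why callers write double parentheses.

```python
class TripleStore:
    """Made-up miniature store, only to show the calling convention."""

    def __init__(self):
        self._items = []

    def add(self, triple):
        # Python 3 form: accept one argument and unpack it in the body.
        sub, pred, obj = triple
        self._items.append((sub, pred, obj))

    def triples(self, pattern):
        # None in the pattern acts as a wildcard for that position.
        for item in self._items:
            if all(want is None or want == got
                   for want, got in zip(pattern, item)):
                yield item


store = TripleStore()
store.add(("blade_runner", "directed_by", "ridley_scott"))  # one tuple
print(list(store.triples(("blade_runner", None, None))))
```

 Calling store.add("a", "b", "c") without the inner parentheses raises a
 TypeError, because that passes three arguments instead of one.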
 
 
 
 
 
 -- 
 
 Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Strange Error with pip install.

2014-07-25 Thread Bruce Whealton

Hello,
  I am using Windows 8.1 (I do have a linux box setup with virtualbox also) 
and I've used python previously but now it is giving me problems whenever I try 
to install anything from PyPI using pip.  The error I get from the command line 
is 
Cannot fetch index base URL http://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement...

I tried within the MinGW environment setup when I installed Git and was given 
Git Bash as a console.  I also installed Bitnami Django stack and even in that 
environment, I get that error.

I did some Google searches, but it seems to only happen when people are trying to 
install Django.  For me it is happening with Django and any other PyPI 
installation with pip.  

Interestingly, as I started trying to get advice with this, in the django chat 
room - at the time I was trying to get django to work in my Windows 
environment, someone suggested Vagrant.  I started creating some boxes with 
Vagrant and Puppet, Chef or bash scripts.  I had problems with this inside a 
Windows command prompt.  So, I tried it under the MinGW environment I mentioned 
above, and half the time, when I run Vagrant up, it starts the environment but 
then it tries to connect using a public key authentication.  Sometimes it will 
just give up and let me run vagrant ssh or use putty.  Other times it just 
times out.  

One idea I have is to import a VirtualBox box from Bitnami into VirtualBox, 
their Django stack.  

Does anyone have any suggestions about this problem I am having using pip 
install somepackage inside Windows (Windows 8, if that matters)?

Thanks in advance,
Bruce
-- 
https://mail.python.org/mailman/listinfo/python-list

Prob. Code Downloaded for Programming the Semantic Web (python code)

2014-07-25 Thread Bruce Whealton

Hello all,
   I downloaded some code accompanying the book Programming the Semantic 
Web.  This question is not Semantic Web related and I doubt that one needs to 
know anything about the Semantic Web to help  me with this.  It's the first 
code sample in the book, I'm embarrassed to say.  I have the code shared here 
(just one file, not the majority of the book or anything): 
http://pastebin.com/e870vjYK

OK, Eclipse with PyDev doesn't like this first line, with the function:
def add(self, (sub, pred, obj)):

It complains about the parentheses just before sub.  Simply removing them just 
moves me down to another error.  I did try using python 3.x (3.4 to be 
specific), which meant changing print statements to function calls.  Of course, 
that didn't fix the errors I was mentioning.  The text uses python 2.7.x.  

There are other places where I thought that there were too many parentheses and 
I tried removing one set of them.  For example this snippet here:

def remove(self, (sub, pred, obj)):
    """Remove a triple pattern from the graph."""
    triples = list(self.triples((sub, pred, obj)))

Are the two sets parentheses needed after self.triples?  That syntax is 
confusing to me.  It seems that it should be
triples = list(self.triples(sub, pred, obj))

The full listing is here: http://pastebin.com/e870vjYK

I agree with the authors that python is a fun and easy language to use, thus it 
is strange that I am getting stuck here.

Thanks,
Bruce
-- 
https://mail.python.org/mailman/listinfo/python-list

find matching contiguous text

2013-11-22 Thread bruce

Hi.

I have a xpath test that generates the text/html between two xpath
functions, basically a chunk of HTML between two dom elements.

However, it's slow. As a test, I'd like to compare the speed if I get
all the HTML following a given element, and then get all the HTML
preceding a given element.. and then do a union/join/intersection of
the text between the two text segments.

I'm trying to find an efficient/effective approach to determining the
contiguous matching text, where the text starts with the 1st line in
the test from the following element test.
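
One simple way to express that idea, as a sketch only: if "following" is everything after the first element and "preceding" is everything before the second one, the text between them is the longest prefix of "following" that is also a suffix of "preceding". The function name and the sample strings are made up, and this brute-force version is not tuned for speed.

```python
def overlap(preceding, following):
    """Return the longest prefix of `following` that ends `preceding`."""
    limit = min(len(preceding), len(following))
    # Try the longest candidate first and shrink until one fits.
    for size in range(limit, 0, -1):
        if preceding.endswith(following[:size]):
            return following[:size]
    return ""


before_second = "<a>1</a><b>2</b>"   # text preceding the second element
after_first = "<b>2</b><c>3</c>"     # text following the first element
print(overlap(before_second, after_first))
```

If both strings are slices of the same document and their offsets are known, plain index arithmetic would be faster than comparing text.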

Thanks
-- 
https://mail.python.org/mailman/listinfo/python-list

splitting file/content into lines based on regex termination

2013-11-07 Thread bruce

hi.

got a test file with the sample content listed below:

the content is one long string, and needs to be split into separate lines

I'm thinking the pattern to split on should be a kind of regex like::
br#45 / 58#0#
or
br#9 / 58#0
but i have no idea how to make this happen!!

if i read the content into a buf - s

import re
dat = re.compile("what goes here??").split(s)

--i'm not sure what goes in the compile() to get the process to work..

thoughts/comments would be helpful.

thanks


test dat::
10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL br#45 /
58#0#10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL br#9 /
58#0#10178#000#C S#S#124##001##DAY#Computer Systems#Roper,
Paul#3#MWFbr#11:00ambr#11:50ambr#1170 TMCB br#41 /
145#0#10178#000#C S#S#124##002##DAY#Computer Systems#Roper,
Paul#3#MWFbr#2:00pmbr#2:50pmbr#1170 TMCB br#40 /
120#0#01489#002#C S#S#142##001##DAY#Intro to Computer
Programming#Burton, Robert div class='instructors'Seppi, Kevinbr
//divspan
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: splitting file/content into lines based on regex termination

2013-11-07 Thread bruce

update...

  dat=re.compile(r"br#(\d+) / (\d+)#(\d+)#").split(s)

almost works..

except i get
m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL
m = 45
m = 58
m = 0
m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL
m = 9
m = 58
m = 0

and what i want is:
m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL 45 / 58,0
m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL 9 / 58,0


so i'd have the results of the compile/regex process to be added to
the split lines
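
For anyone following along: split() returns the text of every capturing group as separate list items, which is where the extra "45", "58" and "0" entries come from. One alternative, sketched below, is to match each record together with its terminator using findall() and rebuild the line. The pattern assumes the terminator looks exactly like the samples above ("br#", a number, a slash, a number, "#", a number, "#"); the names RECORD and split_records are made up.

```python
import re

# Lazily capture the record body, then the three numbers that end it.
RECORD = re.compile(r"(.*?)br#(\d+)\s*/\s*(\d+)#(\d+)#", re.DOTALL)


def split_records(text):
    """Return one rebuilt line per record, terminator values included."""
    return [
        "{} {} / {},{}".format(body.strip(), first, second, third)
        for body, first, second, third in RECORD.findall(text)
    ]


sample = (
    "10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett, "
    "William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL br#45 / 58#0#"
    "10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett, "
    "William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL br#9 / 58#0#"
)
for line in split_records(sample):
    print("m = " + line)
```

Any trailing text that is not followed by a terminator is dropped by this approach, so a final partial record would need separate handling.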

thoughts/comments??

thanks



-- 
https://mail.python.org/mailman/listinfo/python-list

Re: splitting file/content into lines based on regex termination

2013-11-07 Thread bruce

hi.

thanks for the reply.

tried what you suggested. what I see now, is that I print out the
lines, but not the regex data at all. my initial try, gave me the
line, and then the next items , followed by the next line, etc...

what I then tried, was to do a capture/findall of the regex, and
combine the outputs in separate loops, which will be ugly but will
work

  import re
  import sys

  ff = "byu2.dat"
  #fff = "sdsu2.dat"
  with open(ff, "r") as myfile:
    s = myfile.read()


  s = s.replace("&nbsp;", " ")

  #with open(fff, "w") as myfile2:
  #  myfile2.write(s)
  # the terminator looks like: br#45 / 58#0#
  #dat1 = re.compile(r"br#(\d+) / (\d+)#(\d+)#").search(s).findall()
  dat1 = re.findall(r"br#(\d+) / (\d+)#(\d+)#", s)        # tuples of the three numbers
  dat = re.compile(r"br#(\d+) / (\d+)#(\d+)#").split(s)   # lines plus captured numbers
  dat2 = re.compile(r"br#\d+ / \d+#\d+#").split(s)        # lines only
  #dat = re.split(r"(br#(\d+) / (\d+)#(\d+)#)", s)
  #dat = re.compile(r"br#(\d+)").split(s)


  for m in dat:
    if m:
      print("m = " + m)

  #sys.exit()

  print(dat1)
  print(len(dat1))
  print(dat2)
  #sys.exit()

  #for m in dat1:
  #  if m:
  #    print("m = " + m)
  #
  #sys.exit()

  for m in dat2:
    if m:
      print("m = " + m)

  #sys.exit()

  sys.exit()

  return


the test data is pasted to -- http://bpaste.net/show/kYzBUIfhc5023phOVmcu/

thanks
!!


On Thu, Nov 7, 2013 at 1:13 PM, MRAB pyt...@mrabarnett.plus.com wrote:
 On 07/11/2013 17:45, bruce wrote:

 update...

 dat=re.compile("br#(\d+) / (\d+)#(\d+)#").split(s)

 almost works..

 except i get
 m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
 William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL
 m = 45
 m = 58
 m = 0
 m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
 William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL
 m = 9
 m = 58
 m = 0

 and what i want is:
 m = 10116#000#C S#S#100##001##DAY#Fund of Computing#Barrett,
 William#3#MWFbr#08:00ambr#08:50ambr#3718 HBLL 45 / 58,0
 m = 10116#000#C S#S#100##002##DAY#Fund of Computing#Barrett,
 William#3#MWFbr#09:00ambr#09:50ambr#3718 HBLL 9 / 58,0


 so i'd have the results of the compile/regex process to be added to
 the split lines

 thoughts/comments??

 thanks

 The split method also returns what's matched in any capture groups,
 i.e. (\d+). Try omitting the parentheses:

 dat = re.compile(r"br#\d+ / \d+#\d+#").split(s)

 You should also be using raw string literals as above (r"..."). It
 doesn't matter in this instance, but it might in others.
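
The difference MRAB describes can be seen on a small made-up string (the sample text here is an assumption, not the real data):

```python
import re

s = "alpha br#45 / 58#0#beta br#9 / 58#0#"

# With capture groups, the captured text is interleaved with the pieces.
with_groups = re.split(r"br#(\d+) / (\d+)#(\d+)#", s)

# Without groups, only the text between the matches is returned.
without_groups = re.split(r"br#\d+ / \d+#\d+#", s)

print(with_groups)     # ['alpha ', '45', '58', '0', 'beta ', '9', '58', '0', '']
print(without_groups)  # ['alpha ', 'beta ', '']
```

The trailing empty string appears because the input ends with a terminator; filtering with `if m` drops it.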




[issue19456] ntpath doesn't join paths correctly when a drive is present

2013-11-03 Thread Bruce Leban


Bruce Leban added the comment:

A non-UNC windows path consists of two parts: a drive and a conventional path. 
If the drive is left out, it's relative to the current drive. If the path part 
does not have a leading \ then it's relative to the current path on that drive. 
Note that Windows has a different working dir for every drive.

x\y.txt      # in dir x in current dir on current drive
\x\y.txt     # in dir x at root of current drive
E:x\y.txt    # in dir x in current dir on drive E
E:\x\y.txt   # in dir x at root of drive E

UNC paths are similar except \\server\share is used instead of X: and there are 
no relative paths, since the part after share always starts with a \.

Thus when joining paths, if the second path specifies a drive, then the result 
should include that drive, otherwise the drive from the first path should be 
used. The path parts should be combined with the standard logic.

Some additional test cases

tester(ntpath.join(r'C:/a/b/c/d', '/e/f'), r'C:\e\f')
tester(ntpath.join('//a/b/c/d', '/e/f'), '//a/b/e/f')
tester(ntpath.join('C:x/y', r'z'), r'C:x/y/z')
tester(ntpath.join('C:x/y', r'/z'), r'C:/z')

Andrei notes that the following is wrong but wonders what the correct answer is:

>>> ntpath.join('C:/a/b', 'D:x/y')
'C:/a/b\\D:x/y'

The /a/b part of the path is an absolute path on drive C and isn't 
transferable to another drive. So a reasonable result is simply 'D:x/y'. This 
matches Windows behavior. If on Windows you did

$ cd /D C:\a\b
$ cat D:x\y

it would ignore the current directory on C set by the first command and use the 
current directory on drive D.

tester(ntpath.join('C:/a/b', 'D:x/y'), r'D:x/y')
tester(ntpath.join('//c/a/b', 'D:x/y'), r'D:x/y')
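
As a rough check of the rules above, the sketch below runs a few of these joins. It assumes a Python version whose ntpath already includes a fix along these lines (3.4 or later, as far as I know); the variable names are only illustrative. Separators are not normalized by join, so the results keep the forward slashes of the inputs.

```python
import ntpath

# splitdrive separates the drive (or UNC share) from the path part.
drive_relative = ntpath.splitdrive(r"E:x\y.txt")
unc = ntpath.splitdrive(r"\\server\share\x\y.txt")

# A rooted second path without a drive keeps the first path's drive.
same_drive = ntpath.join("C:/a/b/c/d", "/e/f")
unc_root = ntpath.join("//a/b/c/d", "/e/f")
rooted = ntpath.join("C:x/y", "/z")

# A second path on a different drive replaces the first path entirely.
other_drive = ntpath.join("C:/a/b", "D:x/y")
from_unc = ntpath.join("//c/a/b", "D:x/y")

for value in (drive_relative, unc, same_drive, unc_root, rooted,
              other_drive, from_unc):
    print(value)
```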

--
nosy: +Bruce.Leban

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19456
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

trying to strip out non ascii.. or rather convert non ascii

2013-10-26 Thread bruce

hi..

getting some files via curl, and want to convert them from what i'm
guessing to be unicode.

I'd like to convert a string like this::
<div class="profName"><a href="ShowRatings.jsp?tid=1312168">Alcántar,
Iliana</a></div>

to::
<div class="profName"><a href="ShowRatings.jsp?tid=1312168">Alcantar,
Iliana</a></div>

where I convert the
 á  to  a

which appears to be a shift of 128, but I'm not sure how to accomplish this..

I've tested using the different decode/encode functions using
utf-8/ascii with no luck.
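
A common way to do this kind of conversion is to decompose each accented character and drop the combining marks, rather than shifting code points. The offset of 128 happens to hold for á (U+00E1 vs U+0061) but not in general (é is U+00E9 and e is U+0065, a difference of 132). The sketch below assumes the downloaded bytes are UTF-8, which is a guess, and `strip_accents` is a made-up helper name.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFKD splits 'á' into 'a' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

raw = "Alcántar, Iliana".encode("utf-8")   # stand-in for the bytes from curl
text = raw.decode("utf-8")                 # decode first, assuming UTF-8
plain = strip_accents(text)
print(plain)
```

Characters with no decomposition (for example "ø") pass through unchanged, so a final `encode("ascii", "ignore")` may still be wanted if strict ASCII is required.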

I've reviewed stack overflow, as well as a few other sites, but
haven't hit the aha moment.

pointers/comments would be welcome.

thanks

[issue19042] Idle: add option to autosave 'Untitled' edit window

2013-09-17 Thread Bruce Sherwood


Bruce Sherwood added the comment:

Very nice, Terry. Good point about positive vs. negative specifications. I 
think maybe your Prompt to Save versus Autosave is the best scheme, because 
one is specifying whether or not to do something active (namely, put up a 
save dialog).

--
nosy: +Bruce.Sherwood

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19042
___

web2py - running on fedora

2013-08-30 Thread bruce

Hi.

I know this is a python list, but hoping that I can find someone to
help get a base install of web2py running.

I've got an older version of fedora, with py 2.6.4 and apache v2.2

I'm simply trying to get web2py up/running, and then to interface it
with apache, so that the existing webapps (php apps) don't get screwed
up.

Thanks

-bruce

crawling/parsing a webpage based on dynamic javascript

2013-08-18 Thread bruce

Hi.

Looking at using python/celery/twisted to test parsing a test site. Also
looking at being able to parse a site created using dynamic javascript.

I've got test apps to parse a site, but I'm interested in getting a better
understanding of using multi-thread/multi-processing approaches to spin out
as many fetch processes as possible.
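
For the "many fetches at once" part, one standard-library option is a thread pool. This is a minimal sketch only: `fetch` is a hypothetical stand-in that does no network I/O, and the URLs and worker count are made up.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Hypothetical placeholder: a real version would do an HTTP request here.
    return "<html>%s</html>" % url

def fetch_all(urls, max_workers=8):
    """Fetch all urls concurrently and return a {url: page} dict."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each url with its page.
        return dict(zip(urls, pool.map(fetch, urls)))

pages = fetch_all(["http://example.com/a", "http://example.com/b"])
print(sorted(pages))
```

Threads suit this kind of work because fetching is mostly waiting on the network; CPU-heavy parsing is a better fit for separate processes.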

At the same time, I'm interested in understanding a bit better what's used
for parsing the javascript pages in the py world.

Also, rather than just point me to something like scrapy, I'm actually
interested in finding someone who's done this that I can talk to.

Heck, for the right person, I'll even toss some cash your way!!

Thanks

Re: looking for a new router

2013-07-09 Thread Kumita Bruce

Agree.

Sir, this mailing list is for Python discussion. :)


On Tue, Jul 9, 2013 at 12:57 PM, Chris Angelico ros...@gmail.com wrote:

 On Tue, Jul 9, 2013 at 2:52 PM, saadharana saadhar...@gmail.com wrote:
  Hey i'm looking for a new router. I have no set budget. Only US stores. I
  have cable internet and few laptops connected to it so it needs to have a
  strong wireless internet signal. Also i do gaming as well on wireless
  internet and download many large files. Thank you for the help.

 I recommend you go to a small local store that has friendly people and
 real service, tell them what you're needing, and support local
 business with your custom. That'll be more helpful to you than asking
 on a mailing list that's about Python. :)

 ChrisA
 --
 http://mail.python.org/mailman/listinfo/python-list




-- 

Cheers,

Bruce

Re: Atoms, Identifiers, and Primaries

2013-04-17 Thread Bruce McGoveran

Thank you all for your thoughtful replies.  I appreciate your collective 
insight.  I didn't mean to cast the concept of recursion in a negative light - 
I'm actually comfortable with the concept, at least to some extent, and I 
appreciate the need for its use in this documentation.  I also appreciate the 
need to play with expressions at the command line, to gain a feel for how 
expressions are evaluated.  My interest in the language's formal description 
arises merely out of a desire to understand as precisely as possible what 
happens when I hit enter at the command line, or when I run a module.

Your answers to my initial questions in this thread and the ones I posed in 
another thread (Understanding Boolean Expressions) have led me to some 
follow-up questions.  Suppose I'm working at the command line, and I bind x to 
the value 1 and y to the value 0.  Suppose I next type "x and y" and hit 
enter.  Python returns 0 (zero).  I'm glad I checked this before sending in 
this post because I thought it would return a value of False based on the 
presence of the "and" operator.  My question:  what did the interpreter have 
to do to evaluate the expression "x and y" and return a value of zero?

I know the lexical analyzer has to parse the stream of characters into tokens.  
I presume this parsing generates the tokens "x", "y", "and", and a NEWLINE.  
Beyond that, things get a little fuzzy, and it occurs to me that this 
fuzziness is the result of my looking at the expression "x and y" knowing full 
well what each token means and what I want done with them, whereas the 
interpreter won't know these things until it can parse the character stream 
and sort the tokens into some recognizable (and syntactically correct) order.

As I look at it, the expression "x and y" has two atoms, namely x and y.  x 
and y are also primaries, and they represent the most tightly bound parts of 
this expression (meaning they bind more tightly to their underlying objects 
than to the "and" operator).  Incidentally, how does Python figure out that 
the x and y in this expression refer to the x and y I previously bound to 
integer values?  I know there's a symbol table in each execution frame.  How 
does Python know to go to that table and check for x and y?

The "and" token represents an operator, a boolean operator to be specific.  As 
I look at the grammar for and_test in section 5.10 of the documentation, it 
would appear that the and_test resolves via not_test's definition to two 
comparisons, which in turn resolve to or_expr, and then via a series of binary 
bitwise definitions to shift_expr, then to a_expr, then to m_expr, then to 
u_expr, to power, and then primary, and then to atom, which lands us finally 
at non-terminal identifiers (i.e. x and y themselves).

Questions:  In working through these steps, what have I actually demonstrated?  
Is this how Python deconstructs an "and" expression with two operands?  Do I 
take from the fact that the progression from and_test to identifier involved 
reference to bitwise operators that the boolean testing of x and y involves a 
bitwise comparison of x and y?  I have to admit these questions are a little 
confusing; this may reflect the fact I am not exactly sure what it is I am 
trying to ask.  In general terms, I am trying to understand how Python 
evaluates the expression "x and y" in this context.

For my sanity's sake (and, perhaps, for yours) I will stop there.  I send 
thanks in advance for any thoughts you have on my questions.
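
The observable part of this can be tried directly. This is a small sketch in Python 3 syntax (the thread itself refers to the Python 2 docs), and `noisy` and `namespace` are names invented for the illustration.

```python
x, y = 1, 0

# "and" returns one of its operands, not a bool: the first falsy operand,
# or the last operand if all are truthy.
result = x and y          # y, i.e. 0
both_true = 1 and 2       # 2

calls = []

def noisy(value):
    """Record that this operand was evaluated, then return it."""
    calls.append(value)
    return value

# Short-circuit: the right operand is never evaluated when the left is falsy.
short = noisy(0) and noisy(5)

# Names are resolved at run time by looking them up in a namespace,
# shown here with an explicit dict passed to eval.
namespace = {"x": 1, "y": 0}
looked_up = eval("x and y", namespace)

print(result, both_true, short, calls, looked_up)
```

The bytecode for an expression can also be inspected with the standard `dis` module, though the exact instructions differ between Python versions.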




Understanding Boolean Expressions

2013-04-16 Thread Bruce McGoveran

Hello.  I am new to this group.  I've done a search for the topic about which 
I'm posting, and while I have found some threads that are relevant, I haven't 
found anything exactly on point that I can understand.  So, I'm taking the 
liberty of asking about something that may be obvious to many readers of this 
group. 

The relevant Python documentation reference is:  
http://docs.python.org/2/reference/expressions.html#boolean-operations.

I'm trying to make sense of the rules of or_test, and_test, and not_test that 
appear in this section.  While I understand the substance of the text in this 
section, it is the grammar definitions themselves that confuse me.  For 
example, I am not clear how an or_test can be an and_test.  Moreover, if I 
follow the chain of non-terminal references, I move from or_test, to and_test, 
to not_test, to comparison.  And when I look at the definition for comparison, 
I seem to be into bitwise comparisons.  I cannot explain this.

Perhaps an example will help put my confusion into more concrete terms.  
Suppose I write the expression "if x or y" in my code.  I presume this is an 
example of an or_test.  Beyond that, though, I'm not sure whether this maps to 
an and_test (the first option on the right-hand side of the rule) or to the 
or_test or and_test option (the second on the right-hand side of the rule).  
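
One way to explore this is the standard `ast` module, which shows the tree Python builds for an expression. It reflects the abstract syntax, not the grammar chain step by step, so treat it as a sketch; `shape` is a made-up helper, and Python 3 is assumed.

```python
import ast

def shape(source):
    """Return the class name of the top-level node of an expression."""
    return type(ast.parse(source, mode="eval").body).__name__

# "x or y" uses the second alternative of or_test: or_test "or" and_test.
print(shape("x or y"))   # BoolOp

# A bare "x" is also a valid or_test: it takes the first alternative each
# time (or_test -> and_test -> not_test -> comparison -> ... -> atom).
print(shape("x"))        # Name

# The layering of the rules is what gives "and" higher precedence than "or".
tree = ast.parse("a or b and c", mode="eval").body
outer_op = type(tree.op).__name__            # Or
inner_op = type(tree.values[1].op).__name__  # And
print(outer_op, inner_op)
```

The chain down through the bitwise and arithmetic rules only says those operators bind more tightly than "or"; it does not mean a bitwise operation takes place.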

If people can offer some thoughts to put me in the right direction (or out of 
my misery), I would appreciate it.

Thank you in advance.

Re: Understanding Boolean Expressions

2013-04-16 Thread Bruce McGoveran

Thank you all for your thoughts.  I'm just about to post another question about 
atoms and primaries.  If you have a moment to look it over, I would appreciate 
your thoughts.

Many thanks in advance.


Atoms, Identifiers, and Primaries

2013-04-16 Thread Bruce McGoveran

These are terms that appear in section 5 (Expressions) of the Python online 
documentation.  I'm having some trouble understanding what, precisely, these 
terms mean.  I'd appreciate the forum's thoughts on these questions:

1.  Section 5.2.1 indicates that an identifier occurring as an atom is a name.  
However, Section 2.3 indicates that identifiers are names.  My question:  can 
an identifier be anything other than a name?

2.  Section 5.3 defines primaries as the most tightly bound operations of 
Python.  What does this mean?  In particular, if an atom is a primary, what 
operation is the atom performing that leads to the label most tightly bound?  
To put it a different way, I think of atoms as things (i.e. identifiers).  The 
documentation makes me think atoms actually do something, as opposed to being 
things (I think I have in my mind the difference between a noun and a verb as I 
write this).  Perhaps the doing in this case (or binding, if you like) is 
linking (binding) the identifier to the underlying object?  I think it might 
help if I had a better working notion of what a primary is.

3.  Section 5.3.1 offers this definition of an attributeref:
attributeref ::= primary "." identifier

Now, I was at first a little concerned to see the non-terminal primary on the 
right hand side of the definition, since primary is defined to include 
attributeref in section 5.3 (so this struck me as circular).  Am I correct in 
thinking attributeref is defined this way to allow for situations in which the 
primary, whether an atom, attributeref (example:  an object on which a method 
is called that returns another object), subscription, slicing, or call, returns 
an object with property identifier?
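
The recursion in question 3 shows up as nesting when an expression is parsed. Below is a sketch using the standard `ast` module; the expression `obj.method().attr` and its names are invented for the example.

```python
import ast

# Parse a chained attribute access: the primary of the outer attributeref
# is a call, whose own primary is another attributeref.
node = ast.parse("obj.method().attr", mode="eval").body

outer = type(node).__name__                    # Attribute (the ".attr" part)
middle = type(node.value).__name__             # Call      (the "()" part)
inner = type(node.value.func).__name__         # Attribute (the ".method" part)
base = node.value.func.value.id                # "obj"     (the atom at the bottom)

print(outer, middle, inner, base)
```

The recursion ends because every chain eventually reaches an atom, such as the identifier `obj` here.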

These are, I know, long-winded questions.  I appreciate in advance any thoughts 
the group can offer.

The relevant documentation link is:  
http://docs.python.org/2/reference/expressions.html#expressions

Thanks,
Bruce

[issue17677] Invitation to connect on LinkedIn

2013-04-09 Thread Bruce Frederiksen


New submission from Bruce Frederiksen:

LinkedIn


Python,

I'd like to add you to my professional network on LinkedIn.

- Bruce

Bruce Frederiksen
Information Technology and Services Professional
Tampa/St. Petersburg, Florida Area

Confirm that you know Bruce Frederiksen:
https://www.linkedin.com/e/-3qcne3-hfb45911-6b/isd/12316860876/7QjJbS4a/?hs=falsetok=0GlQRpsV-Mh5I1

--
messages: 186404
nosy: dangyogi
priority: normal
severity: normal
status: open
title: Invitation to connect on LinkedIn

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17677
___

[issue8900] IDLE crashes if Preference set to At Startup - Open Edit Window

2013-02-04 Thread Bruce Sherwood


Bruce Sherwood added the comment:

For what it's worth (maybe not much?), the version of IDLE produced by
Guilherme Polo in the 2009 Google Summer of Code, which VPython (vpython.org)
uses under the name VIDLE, does not have any problem with starting with an
edit window and in fact I always use it that way.

Bruce Sherwood

On Mon, Feb 4, 2013 at 8:53 PM, Patrick rep...@bugs.python.org wrote:


 Patrick added the comment:

 I am seeing this as well. It does not repro 100% of the time, but
 frequently enough that it's hard to get anything done. My repro is a little
 simpler and might help understanding the fix.

 Win7
 Python 3.3

 I start IDLE normally from the shortcut in the install.
 Ctrl-N to open an edit window.
 Ctrl-O to open a file.
 Select file and then Idle exits.

 As mentioned, using the menu to open the file seems to work more reliably.
 I've not had a crash that way.

 --
 nosy: +Patrick.Walters

 ___
 Python tracker rep...@bugs.python.org
 http://bugs.python.org/issue8900
 ___


--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8900
___

RE: help with simple print statement!

2012-08-24 Thread Bruce Krayenhoff

Thank-you all, it works now!


Best Wishes,

 Bruce
 C: 604-441-5791
 My Availability:
 https://www.google.com/calendar/embed?src=ecuiatvm07anmj3ch314if3gns%40group.calendar.google.com&ctz=America/Vancouver

From: python-list-bounces+wbrucek=gmail@python.org
[mailto:python-list-bounces+wbrucek=gmail@python.org] On Behalf Of Chris
Kaynor
Sent: August-24-12 12:54 PM
To: python-list@python.org
Subject: Re: help with simple print statement!

On Fri, Aug 24, 2012 at 12:43 PM, Willem Krayenhoff wbru...@gmail.com wrote:

Any idea why print isn't working here?

I tried restarting my Command prompt.  Also, print doesn't work inside a
class.

In Python 3, print was made into a function rather than a statement for
various reasons (I'll leave it to the reader to find sources as to why). You
just need to call it rather than use it as a statement.
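
A short sketch of the difference, assuming a Python 3 interpreter; the variable names are only for illustration:

```python
import io

# In Python 3, print is an ordinary function and must be called.
print("hello")

# Being a function, it accepts keyword arguments such as sep and file.
buf = io.StringIO()
print("hello", "world", sep=", ", file=buf)
captured = buf.getvalue()

# The Python 2 statement form no longer compiles under Python 3.
try:
    compile('print "hello"', "<example>", "exec")
    old_style_ok = True
except SyntaxError:
    old_style_ok = False

print(repr(captured), old_style_ok)
```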

 
