Re: Building CPython

2015-05-14 Thread Dave Angel

On 05/14/2015 01:02 PM, BartC wrote:

On 14/05/2015 17:09, Chris Angelico wrote:

On Fri, May 15, 2015 at 1:51 AM, BartC  wrote:

OK, the answer seems to be No then - you can't just trivially compile the
C modules that comprise the sources with the nearest compiler to hand.
So much for C's famous portability!

(Actually, I think you already lost me on your first line.)



If you want to just quickly play around with CPython's sources, I
would strongly recommend getting yourself a Linux box. Either spin up
some actual hardware with actual Linux, or grab a virtualization
engine like VMWare, VirtualBox, etc, and install into a VM.
With a Debian-based Linux (Debian, Ubuntu, Mint, etc), you should
simply be able to:

sudo apt-get build-dep python3


Actually I had VirtualBox with Ubuntu, but I don't know my way around
Linux and preferred doing things under Windows (and with all my own tools).

But it's now building under Ubuntu.

(Well, I'm not sure what it's doing exactly; the instructions said type
make, then make test, then make install, and it's still doing make test.

I hope there's a quicker way of re-building an executable after a minor
source file change, otherwise doing any sort of development is going to
be impractical.)



That's what make is good for.  It compares the datestamps of the source 
files against the obj files (etc.) and recompiles only when the source 
is newer.  (It's more complex, but that's the idea)
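
The timestamp comparison make performs can be sketched in a few lines of Python (a simplification, of course -- real make also tracks dependency graphs and build rules):

```python
import os

def needs_rebuild(source, target):
    # Rebuild if the target doesn't exist yet, or if the source
    # has been modified more recently than the target.
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)
```

So after a minor source change, only the targets whose sources are newer get recompiled; everything else is left alone.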


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Looking for direction

2015-05-13 Thread Dave Angel

On 05/13/2015 08:45 PM, 20/20 Lab wrote:

You accidentally replied to me, rather than the mailing list.  Please 
use reply-list, or if your mailer can't handle that, do a Reply-All, and 
remove the parts you don't want.


>
> On 05/13/2015 05:07 PM, Dave Angel wrote:
>> On 05/13/2015 07:24 PM, 20/20 Lab wrote:
>>> I'm a beginner to python.  Reading here and there.  Written a couple of
>>> short and simple programs to make life easier around the office.
>>>
>> Welcome to Python, and to this mailing list.
>>
>>> That being said, I'm not even sure what I need to ask for. I've never
>>> worked with external data before.
>>>
>>> I have a LARGE csv file that I need to process.  110+ columns, 72k
>>> rows.
>>
>> That's not very large at all.
>>
> In the grand scheme, I guess not.  However I'm currently doing this
> whole process using office.  So it can be a bit daunting.

I'm not familiar with the "office" operating system.

>>>  I managed to write enough to reduce it to a few hundred rows, and
>>> the five columns I'm interested in.
>>
>>>
>>> Now is were I have my problem:
>>>
>>> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>>> [72976, "YYY", "Item", "Qty", "Noise"],
>>> [123, "XXX" "ItemTypo", "Qty", "Noise"]]
>>>
>>
>> It'd probably be useful to identify names for your columns, even if
>> it's just in a comment.  Guessing from the paragraph below, I figure
>> the first two columns are "account" & "staff"
>
> The columns that I pull are Account, Staff, Item Sold, Quantity sold,
> and notes about the sale (notes arent particularly needed, but the
> higher ups would like them in the report)
>>
>>> Basically, I need to check for rows with duplicate accounts row[0] and
>>> staff (row[1]), and if so, remove that row, and add it's Qty to the
>>> original row.
>>
>> And which column is that supposed to be?  Shouldn't there be a number
>> there, rather than a string?
>>
>>> I really dont have a clue how to go about this.  The
>>> number of rows change based on which run it is, so I couldnt even get
>>> away with using hundreds of compare loops.
>>>
>>> If someone could point me to some documentation on the functions I would
>>> need, or a tutorial it would be a great help.
>>>
>>
>> Is the order significant?  Do you have to preserve the order that the
>> accounts appear?  I'll assume not.
>>
>> Have you studied dictionaries?  Seems to me the way to handle the
>> problem is to read in a row, create a dictionary with key of (account,
>> staff), and data of the rest of the line.
>>
>> Each time you read a row, you check if the key is already in the
>> dictionary.  If not, add it.  If it's already there, merge the data as
>> you say.
>>
>> Then when you're done, turn the dict back into a list of lists.
>>
> The order is irrelevant.  No, I've not really studied dictionaries, but
> a few people have mentioned it.  I'll have to read up on them and, more
> importantly, their applications.  Seems that they are more versatile
> then I thought.
>
> Thank you.

You have to realize that a tuple can be used as a key, in your case a 
tuple of Account and Staff.


You'll have to decide how you're going to merge the ItemSold, 
QuantitySold, and notes.


--
DaveA


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Looking for direction

2015-05-13 Thread Dave Angel

On 05/13/2015 07:24 PM, 20/20 Lab wrote:

I'm a beginner to python.  Reading here and there.  Written a couple of
short and simple programs to make life easier around the office.


Welcome to Python, and to this mailing list.


That being said, I'm not even sure what I need to ask for. I've never
worked with external data before.

I have a LARGE csv file that I need to process.  110+ columns, 72k
rows.


That's not very large at all.


 I managed to write enough to reduce it to a few hundred rows, and
the five columns I'm interested in.




Now is were I have my problem:

myList = [ [123, "XXX", "Item", "Qty", "Noise"],
[72976, "YYY", "Item", "Qty", "Noise"],
[123, "XXX" "ItemTypo", "Qty", "Noise"]]



It'd probably be useful to identify names for your columns, even if it's 
just in a comment.  Guessing from the paragraph below, I figure the 
first two columns are "account" & "staff"



Basically, I need to check for rows with duplicate accounts row[0] and
staff (row[1]), and if so, remove that row, and add it's Qty to the
original row.


And which column is that supposed to be?  Shouldn't there be a number 
there, rather than a string?



I really dont have a clue how to go about this.  The
number of rows change based on which run it is, so I couldnt even get
away with using hundreds of compare loops.

If someone could point me to some documentation on the functions I would
need, or a tutorial it would be a great help.



Is the order significant?  Do you have to preserve the order that the 
accounts appear?  I'll assume not.


Have you studied dictionaries?  Seems to me the way to handle the 
problem is to read in a row, create a dictionary with key of (account, 
staff), and data of the rest of the line.


Each time you read a row, you check if the key is already in the 
dictionary.  If not, add it.  If it's already there, merge the data as 
you say.


Then when you're done, turn the dict back into a list of lists.
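
A minimal sketch of that approach (the column names account, staff, item, qty, notes are assumptions taken from this thread, not the OP's actual CSV headers):

```python
def merge_rows(rows):
    """Collapse rows sharing (account, staff), summing the quantities."""
    merged = {}
    for account, staff, item, qty, notes in rows:
        key = (account, staff)          # a tuple is hashable, so it can be a dict key
        if key in merged:
            merged[key][1] += int(qty)  # merge: add this row's quantity to the first one
        else:
            merged[key] = [item, int(qty), notes]
    # turn the dict back into a list of lists
    return [[account, staff, item, qty, notes]
            for (account, staff), (item, qty, notes) in merged.items()]
```

How to merge the item and notes columns when they differ (as in the "ItemTypo" row) is still the OP's decision; this sketch just keeps the first value seen.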

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python file structure

2015-05-12 Thread Dave Angel

On 05/12/2015 03:58 PM, zljubisic...@gmail.com wrote:

On Tuesday, May 12, 2015 at 9:49:20 PM UTC+2, Ned Batchelder wrote:




If you need to use globals, assign them inside a parse_arguments
function that has a "global" statement in it.

This advice is consistent with Chris' "define things before they
are used."  It does it by defining everything before anything is
run.

As a side note, if you are going to have code at the top-level of
the file, then there's no point in the "if __name__..." clause.
That clause is designed to make a file both runnable and importable.
But your top-level code makes the file very difficult to import.

--Ned.


It makes sense. The only drawback is that variables are global


only "if you need to use globals".  You can't have it both ways.  If 
they're needed, it's because you feel they must be changeable elsewhere 
in the program.  I try to avoid global variables, but make no such 
restraints on the use of global constants.  So for example, the argument 
parsing logic could very well export something as global, but it'd be 
all uppercase and anyone changing it subsequently would get their hand 
slapped by the linter.



so they could be changed anywhere in the program.
I also agree that it is more python approach.

Thanks to both of you.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Feature Request: Reposition Execution

2015-05-11 Thread Dave Angel

On 05/11/2015 08:35 AM, Steven D'Aprano wrote:

On Mon, 11 May 2015 09:57 pm, Dave Angel wrote:


On 05/11/2015 07:46 AM, Skybuck Flying wrote:

Hello,

Sometimes it can be handy to "interrupt/reset/reposition" a running
script.

For example something externally goes badly wrong.



os.kill()

then in your process, handle the exception, and do whatever you think is
worthwhile.



Are you suggesting that the app sends itself a signal?

Is that even allowed?



No idea if it's allowed.  I didn't notice his sample was multithreaded, 
as I grabbed onto the "externally goes badly wrong".




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Feature Request: Reposition Execution

2015-05-11 Thread Dave Angel

On 05/11/2015 07:46 AM, Skybuck Flying wrote:

Hello,

Sometimes it can be handy to "interrupt/reset/reposition" a running script.

For example something externally goes badly wrong.



os.kill()

then in your process, handle the exception, and do whatever you think is 
worthwhile.
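
On POSIX systems, one sketch of that idea: install a handler that raises, and unwind back to a known point when the signal arrives (whether self-signalling fits a multithreaded program is a separate question, and SIGUSR1 doesn't exist on Windows):

```python
import os
import signal

class ResetRequested(Exception):
    """Raised by the signal handler to unwind to a known restart point."""

def _on_usr1(signum, frame):
    raise ResetRequested

signal.signal(signal.SIGUSR1, _on_usr1)  # POSIX-only signal

def run_once():
    try:
        # ... real work would go here; simulate an external reset request:
        os.kill(os.getpid(), signal.SIGUSR1)
        return "finished"
    except ResetRequested:
        return "reset"   # "reposition" execution back to a safe point
```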





--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

2015-05-10 Thread Dave Angel

On 05/10/2015 05:10 PM, zljubisic...@gmail.com wrote:

No, we can't see what ROOTDIR is, since you read it from the config
file.  And you don't show us the results of those prints.  You don't
even show us the full exception, or even the line it fails on.


Sorry I forgot. This is the output of the script:

C:\Python34\python.exe C:/Users/zoran/PycharmProjects/mm_align/bckslash_test.py
C:\Users\zoran\hrt
Traceback (most recent call last):
  File "C:/Users/zoran/PycharmProjects/mm_align/bckslash_test.py", line 43, in <module>
    with open(src_file, mode='w', encoding='utf-8') as s_file:
FileNotFoundError: [Errno 2] No such file or directory: 
'C:\\Users\\zoran\\hrt\\src_70._godišnjica_pobjede_nad_fašizmom_Zašto_većina_čelnika_Europske_unije_bojkotira_vojnu_paradu_u_Moskvi__Kako_će_se_obljetnica_pobjede_nad_nacističkom_Njemačkom_i_njenim_satelitima_obilježiti_u_našoj_zemlji__Hoće_li_Josip_Broz_Tito_o.txt'
70._godišnjica_pobjede_nad_fašizmom_Zašto_većina_čelnika_Europske_unije_bojkotira_vojnu_paradu_u_Moskvi__Kako_će_se_obljetnica_pobjede_nad_nacističkom_Njemačkom_i_njenim_satelitima_obilježiti_u_našoj_zemlji__Hoće_li_Josip_Broz_Tito_o
260 
C:\Users\zoran\hrt\src_70._godišnjica_pobjede_nad_fašizmom_Zašto_većina_čelnika_Europske_unije_bojkotira_vojnu_paradu_u_Moskvi__Kako_će_se_obljetnica_pobjede_nad_nacističkom_Njemačkom_i_njenim_satelitima_obilježiti_u_našoj_zemlji__Hoće_li_Josip_Broz_Tito_o.txt
260 
C:\Users\zoran\hrt\des_70._godišnjica_pobjede_nad_fašizmom_Zašto_većina_čelnika_Europske_unije_bojkotira_vojnu_paradu_u_Moskvi__Kako_će_se_obljetnica_pobjede_nad_nacističkom_Njemačkom_i_njenim_satelitima_obilježiti_u_našoj_zemlji__Hoće_li_Josip_Broz_Tito_o.txt

Process finished with exit code 1

Cfg file has the following contents:

C:\Users\zoran\PycharmProjects\mm_align\hrt3.cfg contents
[Dir]
ROOTDIR = C:\Users\zoran\hrt


I doubt that the problem is in the ROOTDIR value, but of course nothing
in your program bothers to check that that directory exists.  I expect
you either have too many characters total, or the 232nd character is a
strange one.  Or perhaps the title has a backslash in it (you took care of
the forward slash).


How to determine that?


Probably by calling os.path.isdir()
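
A fail-fast check along those lines, as a sketch (the ROOTDIR value comes from your config file, as in the thread):

```python
import os

def check_rootdir(rootdir):
    # Fail fast with a clear message instead of a confusing
    # FileNotFoundError from open() much later.
    if not os.path.isdir(rootdir):
        raise SystemExit("ROOTDIR does not exist: %r" % rootdir)
```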




While we're at it, if you do have an OS limitation on size, your code is
truncating at the wrong point.  You need to truncate the title based on
the total size of src_file and dst_file, and since the code cannot know
the size of ROOTDIR, you need to include that in your figuring.


Well, in my program I am defining a file name as category-id-description.mp3.
If the file is too long I am cutting description (it wasn't clear from my 
example).


Since you've got non-ASCII characters in that name, the utf-8 version of 
the name will be longer.  I don't run Windows, but perhaps it's just a 
length problem after all.
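
The character-count vs byte-count difference is easy to demonstrate (the sample string is a fragment of the title from the traceback):

```python
name = "70._godišnjica"           # contains the non-ASCII character š
print(len(name))                  # 14 characters
print(len(name.encode('utf-8')))  # 15 bytes -- š takes two bytes in UTF-8
```

So a title trimmed to 232 characters can still exceed a byte-based limit once encoded.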




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: functions, optional parameters

2015-05-10 Thread Dave Angel

On 05/09/2015 11:33 PM, Chris Angelico wrote:

On Sun, May 10, 2015 at 12:45 PM, Steven D'Aprano
 wrote:

This is the point where some people try to suggest some sort of complicated,
fragile, DWIM heuristic where the compiler tries to guess whether the user
actually wants the default to use early or late binding, based on what the
expression looks like. "0 is an immutable int, use early binding; [] is a
mutable list, use late binding." sort of thing. Such a thing might work
well for the obvious cases, but it would be a bugger to debug and
work-around for the non-obvious cases when it guesses wrong -- and it will.


What you could have is "late-binding semantics, optional early binding
as an optimization but only in cases where the result is
indistinguishable". That would allow common cases (int/bool/str/None
literals) to be optimized, since there's absolutely no way for them to
evaluate differently.



Except for literals, True, False and None, I can't see any way to 
optimize such a thing.  Just because the name on the right side 
references an immutable object at compile time, it doesn't follow that 
it'll still be the same object later.


Unless late binding means something very different than I understood.
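
The current early-binding behaviour, and the usual workaround, in a nutshell:

```python
def f(x, lst=[]):
    # The default list is built once, at def time (early binding),
    # so every call that omits lst shares the same object.
    lst.append(x)
    return lst

print(f(1))  # [1]
print(f(2))  # [1, 2] -- the earlier append is still there

def g(x, lst=None):
    # Late-binding workaround: build a fresh list on each call.
    if lst is None:
        lst = []
    lst.append(x)
    return lst

print(g(1))  # [1]
print(g(2))  # [2]
```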

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: getting fieldnames from Dictreader before reading lines

2015-05-10 Thread Dave Angel

On 05/09/2015 09:51 PM, Vincent Davis wrote:

On Sat, May 9, 2015 at 5:55 PM, Dave Angel  wrote:


1) you're top-posting, putting your response  BEFORE the stuff you're

responding to.


I responded to my own email; it seemed OK to top-post on myself to say it
was resolved.


Yeah, I overreacted.  There has been a lot of top-posting lately, but if 
a message is expected to be the end of the thread, I shouldn't care what 
it looks like.  Sorry.






2) both messages are in html, which thoroughly messed up parts of your

error messages.

I am posting from google mail (not google groups). Kindly let me know if
this email is also html.


Still html.  It didn't matter in this case, but html can cause painful 
formatting.  And worse, it'll be different for different recipients.  So 
some people will see exactly what was sent, while others will see things 
messed up a little, or a lot.  The formatting can get messed up by the 
sender's mailer, or by the receiver's mail program, or both.


Dennis showed you what your current message looked like, so I won't 
repeat the whole thing.  But your original message had a 
"multipart/alternative" which gets interpreted as plain text by many 
mail programs (including mine == Thunderbird), but it was already 
misformatted, with newlines in the wrong places.  And it had an "html", 
which was correctly formatted, or at least could be interpreted 
correctly by mine.


So the output of google mail was in this case just plain wrong.

Any reason you're using google mail when you have your own email address 
+ domain?



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: getting fieldnames from Dictreader before reading lines

2015-05-09 Thread Dave Angel

On 05/09/2015 07:01 PM, Vincent Davis wrote:

Not sure what I was doing wrong, it seems to work now.



I still see two significant things wrong:

1) you're top-posting, putting your response  BEFORE the stuff you're 
responding to.


2) both messages are in html, which thoroughly messed up parts of your 
error messages.




On Sat, May 9, 2015 at 4:46 PM, Vincent Davis 
wrote:


I am reading a file with Dictreader and writing a new file. I want use the
fieldnames in the Dictwriter from the reader. See below How should I be
doing this?



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Jython from batch file?

2015-05-09 Thread Dave Angel

On 05/09/2015 05:04 PM, vjp2...@at.biostrategist.dot.dot.com wrote:

Thanks.. I suspected it wasn't meant to be taken as in the file

The one thing I'm not sure about is whether Jython is supposed to keep
running after the initial stuff is loaded in..


To put the question in purely DOS terms if you run a program can you pipe it
some commands and then keep it running to take the remaining commands from
the console?



That's not a built-in feature of cmd.exe.  However, it wouldn't be hard 
to write a data source (funny.exe) that took data from a file, and then 
from stdin, sending both in progression to stdout.  Then you'd run the 
two programs as:


funny.exe infile.txt | newprog.exe


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

2015-05-09 Thread Dave Angel

On 05/09/2015 06:31 AM, zljubisic...@gmail.com wrote:


title = title[:232]
title = title.replace(" ", "_").replace("/", "_").replace("!", "_").replace("?", "_")\
        .replace('"', "_").replace(':', "_").replace(',', "_").replace('"', '')\
        .replace('\n', '_').replace("'", '')

print(title)

src_file = os.path.join(ROOTDIR, 'src_' + title + '.txt')
dst_file = os.path.join(ROOTDIR, 'des_' + title + '.txt')

print(len(src_file), src_file)
print(len(dst_file), dst_file)

with open(src_file, mode='w', encoding='utf-8') as s_file:
 s_file.write('test')


shutil.move(src_file, dst_file)

It works, but if you change title = title[:232] to title = title[:233], you will get 
"FileNotFoundError: [Errno 2] No such file or directory".
As you can see ROOTDIR contains \U.


No, we can't see what ROOTDIR is, since you read it from the config 
file.  And you don't show us the results of those prints.  You don't 
even show us the full exception, or even the line it fails on.


I doubt that the problem is in the ROOTDIR value, but of course nothing 
in your program bothers to check that that directory exists.  I expect 
you either have too many characters total, or the 232nd character is a 
strange one.  Or perhaps the title has a backslash in it (you took care of 
the forward slash).


While we're at it, if you do have an OS limitation on size, your code is 
truncating at the wrong point.  You need to truncate the title based on 
the total size of src_file and dst_file, and since the code cannot know 
the size of ROOTDIR, you need to include that in your figuring.
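
A sketch of truncating from the full path length rather than a fixed title length. MAX_PATH = 260 is the classic Windows limit; the 'des_'/'.txt' decoration comes from the thread, and the exact budget (NUL terminator, byte vs character counting) may need adjusting:

```python
import os

MAX_PATH = 260  # classic Windows path limit, counted including the trailing NUL

def truncate_title(rootdir, title, prefix='des_', suffix='.txt'):
    # Budget for everything except the title itself, then cut the title.
    overhead = len(os.path.join(rootdir, prefix + suffix)) + 1  # +1 for the NUL
    return title[:max(0, MAX_PATH - overhead)]
```

For ROOTDIR = C:\Users\zoran\hrt this budget works out to 232 title characters, matching what the OP found empirically.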





--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: calling base class method fetches no results

2015-05-09 Thread Dave Angel

On 05/09/2015 03:59 AM, david jhon wrote:

Hi, I am sorry for sending in five attachments, I cloned the code from here
: Let me explain it here:



Please don't top-post.  Your earlier problem description, which I could 
make no sense of, is now located after your later "clarification".


Thanks for eliminating the attachments.  Many cannot see them.  And for 
extracting only part of the code into the message.  It's still too much 
for me, but others may manage it okay.  To me, it seems likely that most 
of that code will not have any part in the problem you're describing. 
And in some places you have code that's missing its context.


Now, eliminate the pieces of code that are irrelevant to your question, 
and state the problem in terms that make sense.  Somebody is 
instantiating an object.  Exactly which class is being used for that? 
Somebody else is calling a particular method (be specific, rather than 
just saying "some method"), and it's giving the wrong results.  And 
apparently, those wrong results depend on which source file something 
happens in.




Routing Base class defined in DCRouting.py:

import logging
from copy import copy

class Routing(object):
    '''Base class for data center network routing.

    Routing engines must implement the get_route() method.
    '''

    def __init__(self, topo):
        '''Create Routing object.

        @param topo Topo object from Net parent
        '''
        self.topo = topo

    def get_route(self, src, dst, hash_):
        '''Return flow path.

        @param src source host
        @param dst destination host
        @param hash_ hash value

        @return flow_path list of DPIDs to traverse (including hosts)
        '''
        raise NotImplementedError

    def routes(self, src, dst):
        '''Return list of paths

        Only works for Fat-Tree topology

        @param src source host
        @param dst destination host

        @return list of DPIDs (including inputs)
        '''

        complete_paths = []  # List of complete dpid routes

        src_paths = {src: [[src]]}
        dst_paths = {dst: [[dst]]}

        dst_layer = self.topo.layer(dst)
        src_layer = self.topo.layer(src)

        lower_layer = src_layer
        if dst_layer > src_layer:
            lower_layer = dst_layer

        for front_layer in range(lower_layer-1, -1, -1):
            if src_layer > front_layer:
                # expand src frontier
                new_src_paths = {}
                for node in sorted(src_paths):
                    path_list = src_paths[node]
                    for path in path_list:
                        last_node = path[-1]
                        for frontier_node in self.topo.upper_nodes(last_node):
                            new_src_paths[frontier_node] = [path + [frontier_node]]

                            if frontier_node in dst_paths:
                                dst_path_list = dst_paths[frontier_node]
                                for dst_path in dst_path_list:
                                    dst_path_copy = copy(dst_path)
                                    dst_path_copy.reverse()
                                    complete_paths.append(path + dst_path_copy)
                src_paths = new_src_paths

            if dst_layer > front_layer:
                # expand dst frontier
                new_dst_paths = {}
                for node in sorted(dst_paths):
                    path_list = dst_paths[node]
                    for path in path_list:
                        last_node = path[-1]
                        for frontier_node in self.topo.upper_nodes(last_node):
                            new_dst_paths[frontier_node] = [path + [frontier_node]]

                            if frontier_node in src_paths:
                                src_path_list = src_paths[frontier_node]
                                dst_path_copy = copy(path)
                                dst_path_copy.reverse()
                                for src_path in src_path_list:
                                    complete_paths.append(src_path + dst_path_copy)

                dst_paths = new_dst_paths

        if complete_paths:
            return complete_paths


class HashedRouting(Routing):
    '''Hashed routing'''

    def __init__(self, topo):
        self.topo = topo

    def get_route(self, src, dst, hash_):
        '''Return flow path.'''

        if src == dst:
            return [src]

        paths = self.routes(src, dst)
        if paths:
            #print 'hash_:', hash_
            choice = hash_ % len(paths)
            #print 'choice:', choice
            path = sorted(paths)[choice]
            #print 'path:', path
            return path
>
Instantiated in util.py:

from DCTopo import FatTreeTopo
from mininet.util import makeNumeric
from DCRouting import H

Re: Encrypt python files

2015-05-08 Thread Dave Angel

On 05/08/2015 06:59 AM, Denis McMahon wrote:

On Wed, 06 May 2015 00:23:39 -0700, Palpandi wrote:


On Wednesday, May 6, 2015 at 12:07:13 PM UTC+5:30, Palpandi wrote:

Hi,

What are the ways to encrypt python files?


No, I just want to hide the scripts from others.


You can do that by deleting the scripts. Make sure you use a secure
deletion tool.

I'm not aware of any mechanism for encrypted executable python scripts.
You can obfuscate the code, but you can't encrypt it because the python
interpreter needs the script file to execute it.

The same holds true for any mechanism designed to encrypt executable code
regardless of whether it's script or compiled. At the lowest level the
processor only understands the instruction set, and encrypted code has to
be decrypted to execute.

As the decryption method is always available to anyone who has legitimate
access to execute the code, it's impossible to hide the code at that
point.

Example - if I give you an encrypted binary to run on your system, it
either has to be unencryptable


It'd be clearer if you used  decryptable, since unencryptable has a very 
different meaning.


http://en.wiktionary.org/wiki/unencryptable



using tools you already have, or using a
built in unencrypter, both of which you have access to and can use to
unencrypt the encrypted executable code.



likewise decrypter and decrypt.

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is this unpythonic?

2015-05-08 Thread Dave Angel

On 05/08/2015 06:53 AM, Frank Millman wrote:


"Steven D'Aprano"  wrote in message
news:554c8b0a$0$12992$c3e8da3$54964...@news.astraweb.com...

On Fri, 8 May 2015 06:01 pm, Frank Millman wrote:


Hi all


[...]


However, every time I look at my own code, and I see   "def x(y, z=[]):
."   it looks wrong because I have been conditioned to think of it as
a gotcha.




It might be appropriate to define the list at top-level, as

EMPTY_LIST=[]

and in your default argument as
def x(y, z=EMPTY_LIST):

and with the all-caps, you're thereby promising that nobody will modify 
that list.


(I'd tend to do the None trick, but I think this alternative would be 
acceptable)
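
For instance (a sketch of the convention, not a library API):

```python
EMPTY_LIST = []  # all-caps: a promise that nobody mutates this

def x(y, z=EMPTY_LIST):
    # Treat z as read-only; copy before building on it.
    return list(z) + [y]

print(x(1))  # [1]
print(x(2))  # [2] -- no leftover state, because nothing mutated the default
```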


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: asyncio: What is the difference between tasks, futures, and coroutines?

2015-05-08 Thread Dave Angel

On 05/08/2015 02:42 AM, Chris Angelico wrote:

On Fri, May 8, 2015 at 4:36 PM, Rustom Mody  wrote:

On Friday, May 8, 2015 at 10:39:38 AM UTC+5:30, Chris Angelico wrote:

Why have the concept of a procedure?


On Friday, Chris Angelico ALSO wrote:

With print(), you have a conceptual procedure...


So which do you want to stand by?


A procedure, in Python, is simply a function which returns None.
That's all. It's not any sort of special concept. It doesn't need to
be taught. If your students are getting confused by it, stop teaching
it!


One thing newbies get tripped up by is having some path through their 
code that doesn't explicitly return.  And in Python that path therefore 
returns None.  It's most commonly confusing when there are nested ifs, 
and one of the "inner ifs" doesn't have an else clause.
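
A typical example of the trap:

```python
def classify(n):
    if n > 0:
        if n % 2 == 0:
            return "positive even"
        # no else here: positive odd numbers fall through...
    # ...and so do zero and negatives -- both paths return None implicitly

print(classify(4))  # positive even
print(classify(3))  # None -- often a surprise at runtime
```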


Anyway, it would be marginally more useful to that newbie if the compiler 
produced an error, instead of a later runtime error due to an 
unexpected None result.


I don't think Python would be improved by detecting such a condition and 
reporting on it.  That's a job for a linter, or a style guide program.


No different than the compile time checks for variable type that most 
languages impose.  They don't belong in Python.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: PEP idea: On Windows, subprocess should implicitly support .bat and .cmd scripts by using FindExecutable from win32 API

2015-05-07 Thread Dave Angel

On 05/07/2015 06:24 AM, Chris Angelico wrote:

On Thu, May 7, 2015 at 8:10 PM, Marko Rauhamaa  wrote:

Stefan Zimmermann :


And last but not least, Popen behavior on Windows makes it difficult
to write OS-independent Python code which calls external commands that
are not binary by default:


Then, write OS-dependent Python code.

I don't think it's Python's job to pave over OS differences. Java does
that by not offering precious system facilities -- very painful. Python
is taking steps in that direction, but I hope it won't go too far.


On the contrary, I think it *is* a high level language's job to pave
over those differences. Portable C code generally has to have a
whopping 'configure' script that digs into your hardware, OS, library,
etc availabilities, and lets you figure out which way to do things.
Python code shouldn't need to worry about that. You don't need to care
whether you're on a 32-bit or 64-bit computer; you don't need to care
whether it's an Intel chip or a RISCy one; you shouldn't have to
concern yourself with the difference between BSD networking and
WinSock. There'll be a handful of times when you do care, and for
those, it's nice to have some facilities exposed; but the bulk of code
shouldn't need to know about the platform it's running on.

Java went for a philosophy of "write once, run anywhere" in its early
days, and while that hasn't exactly been stuck to completely, it's
still the reasoning behind the omission of certain system facilities.
Python accepts and understands that there will be differences, so you
can't call os.getuid() on Windows, and there are a few restrictions on
the subprocess module if you want maximum portability, but the bulk of
your code won't be any different on Linux, Windows, Mac OS, OS/2,
Amiga, OS/400, Solaris, or a MicroPython board.

ChrisA



It's a nice goal.  But these aren't OS features in Windows, they're 
shell features.  And there are several shells.  If the user has 
installed a different shell, is it Python's job to ignore it and 
simulate what cmd.exe does?


Seems to me that's what shell=True is for.  It signals Python that we're 
willing to trust the shell to do whatever magic it chooses, from adding 
extensions, to calling interpreters, to changing search order, to 
parsing the line in strange ways, to setting up temporary environment 
contexts, etc.


If there were just one shell, it might make sense to emulate its 
features.  Or it might make sense to contort its features to look like a 
Unix shell.  But with multiple possibilities, seems that's more like 
space for a 3rd party library.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: PEP idea: On Windows, subprocess should implicitly support .bat and .cmd scripts by using FindExecutable from win32 API

2015-05-06 Thread Dave Angel

On 05/06/2015 06:11 PM, Stefan Zimmermann wrote:

Hi.

I don't like that subprocess.Popen(['command']) only works on Windows if there 
is a command.exe in %PATH%.


 As a Windows user you would normally expect that command.bat and 
command.cmd can also be run that way.




and command.com.

It's really unfortunate that you picked "command" for your sample 
program name.  Since command.com was the shell in MS-DOS, I was about to 
point you to COMSPEC to address your problem.


There's nothing Windows-specific about that behaviour.  In Linux, there 
are bash commands that can only be run by using shell=True.  Fortunately 
Popen didn't make the mistake of pretending it's a shell.


There is lots more to running a batch file than launching it.  The whole 
syntax of the rest of the commandline differs when you're doing that.



There are simple workarounds like Popen(..., shell=True) but that is a heavy 
overhead for .exe files.


And the reason there's such an overhead is because you're requesting the 
services of the shell.  If you don't need those services, use shell=False.




Currently I use pywin32 and call Popen([win32api.FindExecutable('command')[1]]) 
as a workaround. This has zero overhead.

It should be default for Popen to call FindExecutable internally.

Was this discussed before?
Is it worth a PEP?
Or at least an issue?

Cheers,
Stefan




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: extracting zip item to in-memory

2015-05-06 Thread Dave Angel

On 05/06/2015 04:27 PM, noydb wrote:

I have a zip file containing several files and I want to extract out just the 
.xml file.  I have that code.  Curious if this xml file can be extracted into 
memory.  If so, how to?  I only need to move the file around, and maybe read 
some tags.

Thanks for any help!

python 2.7



See https://docs.python.org/2.7/library/zipfile.html#zipfile.ZipFile.open

To open a particular member and get back a file-like object.

Once you have that, you can use the  read() method of that object.

Once you've coded this, if it doesn't work, post what you've got with a 
description of what doesn't work, and somebody here will be able to fix 
it up.
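
A sketch of the whole round trip (works on 2.7 and 3.x; picking "the first .xml member" is an assumption about the archive layout):

```python
import zipfile

def read_xml_member(zip_path):
    """Return the bytes of the first .xml member, entirely in memory."""
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith('.xml'):
                return zf.read(name)  # no extraction to disk
    return None
```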



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 11:36 AM, Alain Ketterlin wrote:

Yes, plus the time for memory allocation. Since the code uses "r *=
...", space is reallocated when the result doesn't fit. The new size is
probably proportional to the current (insufficient) size. This means
that overall, you'll need fewer reallocations, because allocations are
made in bigger chunks.


That sounds plausible, but  a=5; a*=4  does not update in place. It 
calculates and creates a new object.  Updating lists can work as you 
say, but an int is immutable.


It's an optimization that might be applied if the code generator were a 
lot smarter, (and if the ref count is exactly 1), but it would then be 
confusing to anyone who used id().
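A quick demonstration of the point:

```python
a = 5
before = id(a)
a *= 4          # ints are immutable: this binds `a` to a brand-new object
after = id(a)
print(a, before != after)   # 20 True
```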


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 09:55 AM, Chris Angelico wrote:

On Wed, May 6, 2015 at 11:12 PM, Dave Angel  wrote:

I had guessed that the order of multiplication would make a big difference,
once the product started getting bigger than the machine word size.

Reason I thought that is that if you multiply starting at the top value (and
end with multiplying by 2) you're spending more of the time multiplying
big-ints.

That's why I made sure that both Cecil's and my implementations were
counting up, so that wouldn't be a distinction.

I'm still puzzled, as it seems your results imply that big-int*int is faster
than int*int where the product is also int.


Are you using Python 2 or Python 3 for your testing? In Py3, there's
no type difference, and barely no performance difference as you cross
the word-size boundary. (Bigger numbers are a bit slower to work with,
but not hugely.)



Both Cecil and I are using 3.x; I'm using 3.4 in particular.  And I know 
int covers both big-int and int32.  That's why I called it big-int, 
rather than long.


I was, however, mistaken.  It's not that threshold that we're crossing 
here, but another one, for MUCH larger numbers.  Factorials of 10**5 and 
of 2*10**5 have 456574 and 973351 digits, respectively.  In binary, that 
would be about 190k bytes and 404k bytes, respectively.


I was seeing factorial of 2*10**5 taking about 4.5 times as long as 
factorial of 10**5.  All the other increments seemed fairly proportional.


I'll bet the difference is something like the memory allocator using a 
different algorithm for blocks above 256k.  Or the cache logic hitting a 
threshold.


If it's caching, of course the threshold will differ wildly between 
machine architectures.


If it's the memory allocator, that could easily vary between Python 
versions as well.





--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-06 Thread Dave Angel

On 05/06/2015 02:26 AM, Steven D'Aprano wrote:

On Wednesday 06 May 2015 14:05, Steven D'Aprano wrote:



My interpretation of this is that the difference has something to do with
the cost of multiplications. Multiplying upwards seems to be more expensive
than multiplying downwards, a result I never would have predicted, but
that's what I'm seeing. I can only guess that it has something to do with
the way multiplication is implemented, or perhaps the memory management
involved, or something. Who the hell knows?



I had guessed that the order of multiplication would make a big 
difference, once the product started getting bigger than the machine 
word size.


Reason I thought that is that if you multiply starting at the top value 
(and end with multiplying by 2) you're spending more of the time 
multiplying big-ints.


That's why I made sure that both Cecil's and my implementations were 
counting up, so that wouldn't be a distinction.


I'm still puzzled, as it seems your results imply that big-int*int is 
faster than int*int where the product is also int.


That could use some more testing, though.

I still say a cutoff of about 10% is where we should draw the line in an 
interpretive system.  Below that, you're frequently measuring noise and 
coincidence.


Remember the days when you knew how many cycles each assembly 
instruction took, and could simply add them up to compare algorithms?



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 05:39 PM, Ian Kelly wrote:

On Tue, May 5, 2015 at 3:23 PM, Ian Kelly  wrote:

On Tue, May 5, 2015 at 3:00 PM, Dave Angel  wrote:

def loop(func, funcname, arg):
    start = time.time()
    for i in range(repeats):
        func(arg, True)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

    start = time.time()
    for i in range(repeats):
        func(arg)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))


Note that you're explicitly passing True in one case but leaving the
default in the other. I don't know whether that might be responsible
for the difference you're seeing.


I don't think that's the cause, but I do think that it has something
to do with the way the timing is being run. When I run your loop
function, I do observe the difference. If I reverse the order so that
the False case is tested first, I observe the opposite. That is, the
slower case is consistently the one that is timed *first* in the loop
function, regardless of which case that is.



I created two functions and called them with Timeit(), and the 
difference is now below 3%


And when I take your lead and double the loop() function so it runs each 
test twice, I get steadily decreasing numbers.


I think all of this has been noise caused by the caching of objects 
including function objects.  I was surprised by this, as the loops are 
small enough I'd figure the function object would be fully cached the 
first time it was called.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 04:30 PM, Ian Kelly wrote:

On Tue, May 5, 2015 at 12:45 PM, Dave Angel  wrote:

When the "simple" is True, the function takes noticeably and consistently
longer.  For example, it might take 116 instead of 109 seconds.  For the
same counts, your code took 111.


I can't replicate this. What version of Python is it, and what value
of x are you testing with?


I've looked at dis.dis(factorial_iterative), and can see no explicit reason
for the difference.


My first thought is that maybe it's a result of the branch. Have you
tried swapping the branches, or reimplementing as separate functions
and comparing?



Logic is quite simple:


def factorial_iterative(x, simple=False):
    assert x >= 0
    result = 1
    j = 2
    if not simple:
        for i in range(2, x + 1):
            #print("range value is of type", type(i), "and value", i)
            #print("ordinary value is of type", type(j), "and value", j)
            result *= i
            j += 1
    else:
        for i in range(2, x + 1):
            result *= j
            j += 1

    return result

def loop(func, funcname, arg):
    start = time.time()
    for i in range(repeats):
        func(arg, True)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

    start = time.time()
    for i in range(repeats):
        func(arg)
    print("{0}({1}) took {2:7.4}".format(funcname, arg, time.time()-start))

repeats = 1

and arg is 10**4
loop(factorial_iterative,  "factorial_iterative  ", arg)

My actual program does the same thing with other versions of the 
function, including Cecil's factorial_tail_recursion, and my optimized 
version of that.



Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux

factorial_iterative  (100000) took   3.807
factorial_iterative  (100000) took   3.664

factorial_iterative  (200000) took   17.07
factorial_iterative  (200000) took   15.3

factorial_iterative  (300000) took   38.93
factorial_iterative  (300000) took   36.01


Note that I test them in the opposite order of where they appear in the 
function.  That's because I was originally using the simple flag to test 
an empty loop.  The empty loop is much quicker either way, so it's not 
the issue.  (But if it were, the for-range version is much quicker).


I think I'll take your last suggestion and write separate functions.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Stripping unencodable characters from a string

2015-05-05 Thread Dave Angel

On 05/05/2015 02:19 PM, Paul Moore wrote:

You need to specify that you're using Python 3.4 (or whichever) when 
starting a new thread.



I want to write a string to an already-open file (sys.stdout, typically). 
However, I *don't* want encoding errors, and the string could be arbitrary 
Unicode (in theory). The best way I've found is

 data = data.encode(file.encoding, errors='replace').decode(file.encoding)
 file.write(data)

(I'd probably use backslashreplace rather than replace, but that's a minor 
point).

Is that the best way? The multiple re-encoding dance seems a bit clumsy, but it 
was the best I could think of.

Thanks,
Paul.



If you're going to take charge of the encoding of the file, why not just 
open the file in binary, and do it all with

file.write(data.encode( myencoding, errors='replace') )

i can't see the benefit of two encodes and a decode just to write a 
string to the file.


Alternatively, there's probably a way to open the file using 
codecs.open(), and reassign it to sys.stdout.
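On Python 3 there's also the option of rewrapping the underlying binary buffer once, instead of round-tripping every string — a sketch, not necessarily better for Paul's case; the BytesIO here is a stand-in for a console whose encoding can't represent all of Unicode:

```python
import io

# Simulate a stream with a restrictive encoding (stand-in for
# sys.stdout on a non-UTF-8 console).
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding='ascii', errors='backslashreplace')

out.write('snowman: \u2603\n')   # would raise UnicodeEncodeError unwrapped
out.flush()
print(raw.getvalue())            # b'snowman: \\u2603\n'

# The same wrapping works on the real console:
#   sys.stdout = io.TextIOWrapper(sys.stdout.buffer,
#                                 encoding=sys.stdout.encoding,
#                                 errors='backslashreplace')
```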



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Throw the cat among the pigeons

2015-05-05 Thread Dave Angel

On 05/05/2015 12:18 PM, Cecil Westerhof wrote:



Well, I did not write many tail recursive functions. But what surprised
me was that for large values the ‘tail recursive’ version was more
efficient than the iterative version. And that was with me
implementing the tail recursion. I expect the code to be more
efficient when the compiler implements the tail recursion.




You've said that repeatedly, so I finally took a look at your webpage

https://github.com/CecilWesterhof/PythonLibrary/blob/master/mathDecebal.py

I didn't have your framework to call the code, so I just extracted some 
functions and did some testing.  I do see some differences, where the 
so-called tail_recursive functions are sometimes faster, but I did some 
investigating to try to determine why.



I came up with the conclusion that sometimes the multiply operation 
takes longer than other times.  And in particular, i can see more 
variation between the two following loops than between your two functions.



def factorial_iterative(x, simple=False):
    assert x >= 0
    result = 1
    j = 2
    if not simple:
        for i in range(2, x + 1):
            result *= i
            j += 1
    else:
        for i in range(2, x + 1):
            result *= j
            j += 1
            pass

    return result

When the "simple" is True, the function takes noticeably and 
consistently longer.  For example, it might take 116 instead of 109 
seconds.  For the same counts, your code took 111.


I've looked at dis.dis(factorial_iterative), and can see no explicit 
reason for the difference.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Step further with filebasedMessages

2015-05-05 Thread Dave Angel

On 05/05/2015 11:25 AM, Cecil Westerhof wrote:



I have a file with quotes and a file with tips. I want to place random
messages from those two (without them being repeated to soon) on my
Twitter page. This I do with ‘get_random_message’. I also want to put
the first message of another file and remove it from the file. For
this I use ‘dequeue_message’.



Removing lines from the start of a file is an n-squared operation. 
Sometiomes it pays to reverse the file once, and just remove from the 
end.  Truncating a file doesn't require the whole thing to be rewritten, 
nor risk losing the file if the make-new-file-rename-delete-old isn't 
quite done right.


Alternatively, you could overwrite the line, or more especially the 
linefeed before it.  Then you always do two readline() calls, using the 
second one's result.


Various other games might include storing an offset at the begin of 
file, so you start by reading that, doing a seek to the place you want, 
and then reading the new line from there.



Not recommending any of these, just bringing up alternatives.
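As an illustration of the reverse-once-then-truncate idea (a sketch, not from the thread; pop_last_line is a hypothetical helper): once the queue file is stored last-message-first, dequeuing means popping the final line and truncating, which touches only the tail of the file instead of rewriting all of it.

```python
import os
import tempfile

def pop_last_line(path):
    """Remove and return the file's last line by truncating in place."""
    with open(path, 'rb+') as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return None                 # queue is empty
        pos = end
        f.seek(pos - 1)
        if f.read(1) == b'\n':          # ignore the trailing newline
            pos -= 1
        start = pos
        while start > 0:                # scan back to the previous newline
            f.seek(start - 1)
            if f.read(1) == b'\n':
                break
            start -= 1
        f.seek(start)
        line = f.read(pos - start).decode('utf-8')
        f.truncate(start)               # chop the popped line off the file
        return line

# Demo: messages stored in reverse order, so the "first" message is last.
with tempfile.NamedTemporaryFile('w', delete=False, suffix='.txt') as tf:
    tf.write('third\nsecond\nfirst\n')
print(pop_last_line(tf.name))   # first
print(pop_last_line(tf.name))   # second
os.unlink(tf.name)
```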

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Bitten by my C/Java experience

2015-05-04 Thread Dave Angel

On 05/04/2015 04:28 PM, Cecil Westerhof wrote:

Op Monday 4 May 2015 21:39 CEST schreef Ian Kelly:


On Mon, May 4, 2015 at 11:59 AM, Mark Lawrence  wrote:

On 04/05/2015 16:20, Cecil Westerhof wrote:


Potential dangerous bug introduced by programming in Python as if
it was C/Java. :-( I used: ++tries that has to be: tries += 1

Are there other things I have to be careful on? That does not work
as in C/Java, but is correct syntax.



Not dangerous at all, your test code picks it up. I'd also guess,
but don't actually know, that one of the various linter tools could
be configured to find this problem.


pylint reports it as an error.


I installed it. Get a lot of messages. Mostly convention. For example:
 Unnecessary parens after 'print' keyword


Sounds like it's configured for Python 2.x.  There's probably a setting 
to tell it to use Python3 rules.




And:
 Invalid variable name "f"
for:
 with open(real_file, 'r') as f:


Sounds like a bad wording.  Nothing invalid about it, though it is a bit 
short.  There are certain one letter variables which are so common as to 
be expected, but others should be avoided.





But still something to add to my toolbox.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Converting 5.223701009526849e-05 to 5e-05

2015-05-03 Thread Dave Angel

On 05/03/2015 05:22 AM, Cecil Westerhof wrote:

Op Sunday 3 May 2015 10:40 CEST schreef Ben Finney:


Cecil Westerhof  writes:


When I have a value like 5.223701009526849e-05 in most cases I am
not interested in all the digest after the dot.


What type of value is it?


If the absolute value is bigger than 0 and smaller than 1, it should be a
float. ;-)



or for many uses, more likely a decimal.Decimal(), where many of the 
problems Ben mentions are moot.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Custom alphabetical sort

2015-05-02 Thread Dave Angel

On 05/02/2015 11:35 AM, Pander Musubi wrote:

On Monday, 24 December 2012 16:32:56 UTC+1, Pander Musubi  wrote:

Hi all,

I would like to sort according to this order:

(' ', '.', '\'', '-', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 
'A', 'ä', 'Ä', 'á', 'Á', 'â', 'Â', 'à', 'À', 'å', 'Å', 'b', 'B', 'c', 'C', 'ç', 
'Ç', 'd', 'D', 'e', 'E', 'ë', 'Ë', 'é', 'É', 'ê', 'Ê', 'è', 'È', 'f', 'F', 'g', 
'G', 'h', 'H', 'i', 'I', 'ï', 'Ï', 'í', 'Í', 'î', 'Î', 'ì', 'Ì', 'j', 'J', 'k', 
'K', 'l', 'L', 'm', 'M', 'n', 'ñ', 'N', 'Ñ', 'o', 'O', 'ö', 'Ö', 'ó', 'Ó', 'ô', 
'Ô', 'ò', 'Ò', 'ø', 'Ø', 'p', 'P', 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 
'U', 'ü', 'Ü', 'ú', 'Ú', 'û', 'Û', 'ù', 'Ù', 'v', 'V', 'w', 'W', 'x', 'X', 'y', 
'Y', 'z', 'Z')

How can I do this? The default sorted() does not give the desired result.

Thanks,

Pander


Meanwhile Python 3 supports locale aware sorting, see 
https://docs.python.org/3/howto/sorting.html



You're aware that the message you're responding to is 16 months old?

And answered pretty thoroughly, starting with the fact that the OP's 
desired order didn't match any particular locale.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python is not bad ;-)

2015-05-02 Thread Dave Angel

On 05/02/2015 05:33 AM, Cecil Westerhof wrote:

Please check your email settings.  Your messages that you type seem to 
be indented properly, but those that are quoting earlier messages (even 
your own) are not.  See below.  I suspect there's some problem with how 
your email program processes html messages.



Op Saturday 2 May 2015 10:26 CEST schreef Cecil Westerhof:


That is mostly because the tail recursion version starts multiplying
at the high end. I wrote a second version:
def factorial_tail_recursion2(x):
    y = 1
    z = 1
    while True:
        if x == z:
            return y
        y *= z
        z += 1

This has almost the performance of the iterative version: 34 and 121
seconds.

So I made a new recursive version:
def factorial_recursive(x, y = 1, z = 1):
    return y if x == z else factorial_recursive(x, x * y, z + 1)


Stupid me 'x == z' should be 'z > x'



I can't see how that is worth doing. The recursive version is already a 
distortion of the definition of factorial that I learned.  And to force 
it to be recursive and also contort it so it does the operations in the 
same order as the iterative version, just to gain performance?


If you want performance on factorial, write it iteratively, in as 
straightforward a way as possible.  Or just call the library function.


Recursion is a very useful tool in a developer's toolbox.  But the only 
reason I would use it for factorial is to provide a simple demonstration 
to introduce the concept to a beginner.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python is not bad ;-)

2015-05-02 Thread Dave Angel

On 05/02/2015 05:58 AM, Marko Rauhamaa wrote:

Chris Angelico :


Guido is against anything that disrupts tracebacks, and optimizing
tail recursion while maintaining traceback integrity is rather harder.


Tail recursion could be suppressed during debugging. Optimized code can
play all kinds of non-obvious tricks with the execution frame.


In the situations where it really is simple, you can always make the
change in your own code anyway. Often, the process of converting
recursion into tail recursion warps the code to the point where it's
abusing recursion to implement iteration anyway, so just make it
iterative.


While you shouldn't actively replace Python iteration with recursion, I
strongly disagree that naturally occurring tail recursion is abuse or
should be avoided in any manner.



When you strongly disagree, make sure you're disagreeing with what Chris 
actually said.  The key phrase in his message was

"converting recursion into tail recursion"

NOT  "converting iteration into recursion"
and NOT "naturally occurring tail recursion"


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is my implementation of happy number OK

2015-04-30 Thread Dave Angel

On 04/30/2015 07:31 PM, Jon Ribbens wrote:

On 2015-04-30, Dave Angel  wrote:

Finally, I did some testing on Jon Ribbens's version.  His was
substantially faster for smaller sets, and about the same for 10**7.  So
it's likely it'll be slower than yours and mine for 10**8.


You know what they say about assumptions. Actually, my version is six
times faster for 10**8 (running under Python 3.4).


But the real reason I didn't like it was it produced a much larger
set of happy_numbers, which could clog memory a lot sooner.  For
10**7 items, I had 3250 happy members, and 19630 unhappy.  And Jon
had 1418854 happy members.


Er, what? You're complaining that mine is less efficient by not
producing the wrong output?



It's not intended as a criticism;  you solved a different problem.  The 
problem Cecil was solving was to determine if a particular number is 
happy.  The problem you solved was to make a list of all values under a 
particular limit that are happy.


Both produce identical results for the Cecil purpose, and yours is 
faster if one wants all the values.  But if one wants a sampling of 
values, his function will fetch them quickly, and even if you want them 
all, his function will use much less memory.


He keeps only one permutation of each value in the set, for substantial 
savings in space.  For example, he might just keep 28, while you keep 28 
and 82, 208, 280, 802, and 820.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Rounding a number

2015-04-30 Thread Dave Angel

On 04/30/2015 06:35 PM, Seymore4Head wrote:

On Thu, 30 Apr 2015 22:00:17 +0200, Thijs Engels
 wrote:


round(65253, -3)

might be what you are looking for...


On Thu, Apr 30, 2015, at 21:49, Seymore4Head wrote:

I have this page book marked.
https://mkaz.com/2012/10/10/python-string-format/

I am getting numbers from sixty thousand to two hundred thousand.
I would like to round them to the nearest thousand.
So 65,253 should read 65,000.
How?

Total=2100
for x in range (10,35):
 count=1000/x
 print ("Foo {:7,.0f} Fighters".format(Total*count))
--
https://mail.python.org/mailman/listinfo/python-list


Thanks

I know there are more than one way to round and answer.  I was hoping
that using the {:7,.0f} formatting had a solution.



There are definite tradeoffs, but since you're rounding to integer size, 
the round() function works fine.  If you wanted tenths, you'd have to 
realize that a float (which is binary float) can't represent them 
exactly.  So depending on what further processing you do, you might see 
some effects that would not seem like it worked right.


Using the % or the .format style of formatting works fine, but it gives 
you a string.  You didn't specify that, so people probably assumed you 
wanted numbers.
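Putting the two pieces together — round() for the nearest thousand, the comma format for display:

```python
value = 65253
rounded = round(value, -3)          # negative ndigits rounds left of the point
print(rounded)                      # 65000
print("{:7,.0f}".format(rounded))   # ' 65,000'
```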


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is my implementation of happy number OK

2015-04-30 Thread Dave Angel

On 04/30/2015 04:35 PM, Cecil Westerhof wrote:

Op Thursday 30 Apr 2015 20:53 CEST schreef Dave Angel:



Finally, I did some testing on Jon Ribbens's version. His was
substantially faster for smaller sets, and about the same for 10**7.
So it's likely it'll be slower than yours and mine for 10**8. But
the real reason I didn't like it was it produced a much larger set
of happy_numbers, which could clog memory a lot sooner. For 10**7
items, I had 3250 happy members, and 19630 unhappy. And Jon had
1418854 happy members.


My version has 1625 and 9814. I do not understand the difference.



My error.  I had also written a version of the function that stored 
strings instead of ints, and the counts of 3250/19630 was for sets that 
had BOTH.


An exercise for the reader.  Notice that in my brute force algorithm I 
use no global sets.  But I do use an internal list, which is apparently 
unbounded.  So it's feasible to run out of memory.  The challenge is to 
write a similar function that uses no lists, sets, or dicts, just an 
algorithm that detects for an arbitrarily sized number whether it's happy 
or not.  (It may not be very quick, but that's yet to be decided.)  I'm 
already surprised that the present brute force function davea1() only 
takes about twice as long as the fancy global caching schemes.
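For what it's worth, one way to meet that challenge — a sketch, not a solution from the thread — is Floyd's tortoise-and-hare cycle detection, which needs only two integers of state: either the sequence falls into the fixed point 1 (happy) or the two pointers meet inside the unhappy cycle.

```python
def digit_square_sum(n):
    """Sum of the squares of n's decimal digits."""
    return sum(int(d) ** 2 for d in str(n))

def is_happy(n):
    """Happy-number test in constant space: no lists, sets, or dicts,
    just two values walking the sequence at different speeds."""
    slow, fast = n, digit_square_sum(n)
    while fast != 1 and slow != fast:
        slow = digit_square_sum(slow)
        fast = digit_square_sum(digit_square_sum(fast))
    return fast == 1

print([i for i in range(1, 32) if is_happy(i)])   # [1, 7, 10, 13, 19, 23, 28, 31]
```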


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Lucky numbers in Python

2015-04-30 Thread Dave Angel

On 04/30/2015 02:55 PM, Cecil Westerhof wrote:

Because I want the code to work with Python 3 also, the code is now:
    def lucky_numbers(n):
        """
        Lucky numbers from 1 up-to n
        http://en.wikipedia.org/wiki/Lucky_number
        """

        if n < 3:
            return [1]
        sieve = list(range(1, n + 1, 2))
        sieve_index = 1
        while True:
            sieve_len = len(sieve)
            if (sieve_index + 1) > sieve_len:
                break
            skip_count = sieve[sieve_index]
            if sieve_len < skip_count:
                break
            del sieve[skip_count - 1 : : skip_count]
            sieve_index += 1
        return sieve

It looks like the list in:
 sieve = list(range(1, n + 1, 2))

does not have much influence in Python 2. So I was thinking of leaving
the code like it is. Or is it better to check and do the list only
with Python 3?



I'd do something like this at top of the module:

try:
range = xrange
except NameError as ex:
pass

then use range as it is defined in Python3.

if that makes you nervous, then define irange = xrange, and if it gets a 
NameError exception, irange = range



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: l = range(int(1E9))

2015-04-30 Thread Dave Angel

On 04/30/2015 02:48 PM, alister wrote:

On Thu, 30 Apr 2015 20:23:31 +0200, Gisle Vanem wrote:


Cecil Westerhof wrote:


If I execute:
  l = range(int(1E9))

The python process gobbles up all the memory and is killed. The problem
is that after this my swap is completely used, because other processes
have swapped to it. This make those programs more slowly. Is there a
way to circumvent Python claiming all the memory?

By the way: this is CPython 2.7.8.


On what OS? If I try something similar on Win-8.1 and CPython 2.7.5
(32-bit):

   python -c "for i in range(int(1E9)): pass"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
MemoryError


--gv


also MemoryError on Fedora 21 32 bit



That's presumably because you end up running out of address space before 
you run out of swap space.  On a 64 bit system the reverse will be true, 
unless you have a really *really* large swap file


ulimit is your friend if you've got a program that wants to gobble up 
all of swap space.
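As an illustration (bash on Linux; the 2 GB figure is arbitrary):

```shell
# Cap this shell's virtual memory so a runaway allocation dies with
# MemoryError instead of dragging the whole machine into swap.
# (Python 2's range builds a real list; the list() call makes the
# demo behave the same on Python 3.)
ulimit -v 2097152                          # value is in KiB, ~2 GB
python -c "l = list(range(int(1E9)))"      # now fails fast with MemoryError
```

The limit applies to the shell and everything it launches, so it's easiest to do this in a throwaway subshell.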


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is my implementation of happy number OK

2015-04-30 Thread Dave Angel

On 04/30/2015 11:59 AM, Cecil Westerhof wrote:

I implemented happy_number function:
    _happy_set   = { '1' }
    _unhappy_set = set()

    def happy_number(n):
        """
        Check if a number is a happy number
        https://en.wikipedia.org/wiki/Happy_number
        """

        def create_current(n):
            current_array = sorted([value for value in str(n) if value != '0'])
            return (current_array, ''.join(current_array))

        global _happy_set
        global _unhappy_set

        current_run = set()
        current_array, \
        current_string = create_current(n)
        if current_string in _happy_set:
            return True
        if current_string in _unhappy_set:
            return False
        while True:
            current_run.add(current_string)
            current_array, \
            current_string = create_current(sum([int(value) ** 2
                                                 for value in current_string]))
            if current_string in current_run | _unhappy_set:
                _unhappy_set |= current_run
                return False
            if current_string in _happy_set:
                _happy_set |= current_run
                return True

Besides it need some documentation: is it a good implementation? Or
are there things I should do differently?

To decide for the values from 1 to 1E8 if each is happy or not takes
280 seconds. Not too bad, I think. Also not very good.



First comment, if you're coding a complex implementation like this, take 
the time to do a brute force one as well. Then you can compare the 
results between brute force and your optimized one for all possible 
values, and make sure you haven't introduced any bugs.


My brute force looks like:

#Dave's version, brute force

def davea1(n):
    cache = []
    anum = str(n)
    newnum = 0
    while newnum != 1:
        newnum = sum(int(i)*int(i) for i in anum)
        anum = str(newnum)
        if newnum in cache:
            return False #not a happy number
        cache.append(newnum)
    return True  #found a happy number

I then tried an optimized one, and my speed is only about 10% faster 
than yours for 1e7 loops.  I show it anyway, since I think it reads a 
little better.  And readability counts much more than a little performance.


 #optimizations:
#   cached happy and unhappy sets
#   sort the digits, and compare only the sorted values, without zeroes
davea_happy = {1}
davea_unhappy = set()

SQUARES = dict((str(i), i*i) for i in xrange(10))

def davea2(n):
    global davea_happy, davea_unhappy
    cache = set()
    newnum = n
    while newnum != 1:
        newnum = int("".join(sorted(str(newnum))))
        if newnum in davea_unhappy or newnum in cache:
            davea_unhappy |= cache
            return False #not a happy number
        if newnum in davea_happy:
            break
        cache.add(newnum)
        newnum = sum(SQUARES[ch] for ch in str(newnum))
    davea_happy |= cache
    return True  #found a happy number

Finally, I did some testing on Jon Ribbens's version.  His was 
substantially faster for smaller sets, and about the same for 10**7.  So 
it's likely it'll be slower than yours and mine for 10**8.  But the real 
reason I didn't like it was it produced a much larger set of 
happy_numbers, which could clog memory a lot sooner.  For 10**7 items, I 
had 3250 happy members, and 19630 unhappy.  And Jon had 1418854 happy 
members.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: seek operation in python

2015-04-30 Thread Dave Angel

On 04/30/2015 04:06 AM, Cecil Westerhof wrote:

Op Thursday 30 Apr 2015 09:33 CEST schreef Chris Angelico:


On Thu, Apr 30, 2015 at 4:27 PM, Cecil Westerhof  wrote:

with open("input.cpp") as f:
    lines = f.readlines()
print(lines[7])


Is the following not better:
print(open('input.cpp', 'r').readlines()[7])

Time is the same (about 25 seconds for 100.000 calls), but I find
this more clear.


The significant difference is that the 'with' block guarantees to
close the file promptly. With CPython it probably won't make a lot
of difference, and in a tiny script it won't do much either, but if
you do this on Jython or IronPython or MicroPython or some other
implementation, it may well make a gigantic difference - your loop
might actually fail because the file's still open.


I thought that in this case the file was also closed. But if that is
not the case I should think about this when I switch to another
version as CPython.

I wrote a module where I have:
    def get_indexed_message(message_filename, index):
        """
        Get index message from a file, where 0 gets the first message
        """

        return open(expanduser(message_filename),
                    'r').readlines()[index].rstrip()

But this can be used by others also and they could be using Jython or
another implementation. So should I rewrite this and other functions?
Or would it be OK because the open is in a function?



No, it's not going to close the file just because the open is in a 
function.  The "with" construct was designed to help solve exactly this 
problem.  Please use it.
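A with-based rewrite of the quoted helper might look like this (a sketch; the demo file is a throwaway):

```python
import tempfile
from os.path import expanduser

def get_indexed_message(message_filename, index):
    """Get message number `index` (0-based) from a file.

    The `with` block guarantees the file is closed promptly on any
    Python implementation, refcounted or not.
    """
    with open(expanduser(message_filename)) as f:
        return f.readlines()[index].rstrip()

# Demo with a temporary file standing in for the message file.
with tempfile.NamedTemporaryFile('w', delete=False) as tf:
    tf.write('zero\none\ntwo\n')
print(get_indexed_message(tf.name, 1))   # one
```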




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Let exception fire or return None

2015-04-30 Thread Dave Angel

On 04/30/2015 03:43 AM, Cecil Westerhof wrote:

I have a function to fetch a message from a file:
    def get_indexed_message(message_filename, index):
        """
        Get index message from a file, where 0 gets the first message
        """

        return open(expanduser(message_filename),
                    'r').readlines()[index].rstrip()

What is more the Python way: let the exception fire like this code
when index is too big, or catch it and return None?

I suppose working zero based is OK.



Fire an exception.

One advantage is that the exception will pinpoint exactly which line of 
the function had a problem.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Not possible to hide local variables

2015-04-29 Thread Dave Angel

On 04/29/2015 10:16 AM, Grant Edwards wrote:

On 2015-04-28, Cecil Westerhof  wrote:

If I remember correctly you can not hide variables of a class or make
them read-only?

I want to rewrite my moving average to python. The init is:
    def __init__(self, length):
        if type(length) != int:
            raise ParameterError, 'Parameter has to be an int'
        if n < 0:
            raise ValueError, 'Parameter should be greater or equal 2'
        self.length         = length
        self.old_values     = []
        self.current_total  = 0

But when someone changes length, old_values, or current_total that
would wreck havoc with my class instance. What is the best way to
handle this?


It's like the punchline to the old doctor joke: if it hurts when you
do that, then don't _do_ that:

   def __init__(self, length):
   if type(length) != int:


Better:  if not isinstance(length, int):


   raise ParameterError, 'Parameter has to be an int'
   if n < 0:


Better:   if length < 0:  (since n is undefined)


   raise ValueError, 'Parameter should be greater or equal 2'
   self._length = length
   self._old_values = []
   self._current_total  = 0

The convention is that properties that start with underscores are
private.  They're not hidden, but if people touch them and it breaks
something, it's their fault.  Whether you go all Torvalds on their ass
for doing so is left as an exercise for the reader.
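If a genuinely read-only attribute is wanted rather than just the convention, a property can back the public name with an underscore field — a sketch, with names loosely following the thread's example (not Cecil's actual class):

```python
class MovingAverage:
    """Illustrative moving-average shell with a read-only `length`."""

    def __init__(self, length):
        if not isinstance(length, int):
            raise TypeError('Parameter has to be an int')
        if length < 2:
            raise ValueError('Parameter should be greater or equal 2')
        self._length = length          # "private" by convention
        self._old_values = []
        self._current_total = 0

    @property
    def length(self):
        """Read-only: assigning to instance.length raises AttributeError."""
        return self._length

m = MovingAverage(5)
print(m.length)   # 5
```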




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: implicitly concats of adjacent strings does not work with format

2015-04-29 Thread Dave Angel

On 04/29/2015 08:42 AM, Cecil Westerhof wrote:

I have the folowing print statements:
 print(
 'Calculating fibonacci_old, fibonacci_memoize and '
 'fibonacci_memoize once for {0} '.format(large_fibonacci))


 print(
 'Calculating fibonacci_old, fibonacci_memoize and '
 'fibonacci_memoize once for {0} '.format(large_fibonacci) +
 'to determine speed increase')

 print(
 'Calculating fibonacci_old, fibonacci_memoize and '
 'to determine speed increase'
 'fibonacci_memoize once for {0} '.format(large_fibonacci))


 print(
 'Calculating fibonacci_old, fibonacci_memoize and '
 'fibonacci_memoize once for {0} '.format(large_fibonacci)
 'to determine speed increase')

The first three work, but the last gives:
 'to determine speed increase')
 ^
 SyntaxError: invalid syntax

Not very important, because I can use the second one, but I was just
wondering why it goes wrong.



Adjacent string literals are concatenated.  But once you've called a 
method (.format()) on that literal, you now have an expression, not a 
string literal.


You could either change the last line to

 + 'to determine speed increase')

or you could concatenate all the strings before calling the format method:


print(
'Calculating fibonacci_old, fibonacci_memoize and '
'fibonacci_memoize once for {0} '
'to determine speed increase' .format(large_fibonacci))

Something you may not realize is that the adjacent-literal concatenation occurs 
at compile time, so your third example could be transformed from:


print(
'Calculating fibonacci_old, fibonacci_memoize and '
'to determine speed increase'
'fibonacci_memoize once for {0} '.format(large_fibonacci))

to:
print(
'Calculating fibonacci_old, fibonacci_memoize and '
'to determine speed increase'
'fibonacci_memoize once for {0}'
' '.format(large_fibonacci))


All three literals are combined before format() is called.  Knowing this 
could be vital if you had {} elsewhere in the (single) literal.
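
A quick way to see the difference: adjacent literals merge at compile time, 
so placeholders in either literal are filled by the one format() call, while 
putting a literal next to a format() expression is a syntax error:

```python
# The two literals are one string before format() ever runs,
# so both placeholders get filled:
combined = 'a {0} ' 'b {1}'.format(1, 2)
print(combined)  # prints: a 1 b 2

# But a literal next to an expression is a SyntaxError, as in the question:
#   bad = 'a {0} '.format(1) 'b'    # SyntaxError

# Explicit + works on expressions:
ok = 'a {0} '.format(1) + 'b'
print(ok)        # prints: a 1 b
```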


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Panda data read_csv returns 'TextFileReader' object

2015-04-24 Thread Dave Angel

On 04/24/2015 04:04 PM, Kurt wrote:

Isn't the call pd.read_csv(filepath,...) supposed to return a dataframe, not 
this other object? I keep getting the following error when I try to view the 
attribute head.

AttributeError: 'TextFileReader' object has no attribute 'head'

I reference pandas as pd and df is supposed to define itself. I had this working 
at one time.



Please supply the Python version, the pandas version, and the code you 
ran.  It can also be useful to show how/where you installed pandas, like 
the URL you used, or the apt-get, or the pip command, or whatever.


Then show the complete error traceback, not just a summary.  And if the 
error really is on the line you partially supplied above, what's the 
type and contents of filepath?  What are the other arguments' values?


The function I suspect you're trying to call is:

pandas.io.parsers.read_csv(filepath, ...)

but I can't tell from your description.

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: can you help guys?

2015-04-24 Thread Dave Angel

On 04/24/2015 04:29 AM, brokolists wrote:

24 Nisan 2015 Cuma 02:20:12 UTC+3 tarihinde Steven D'Aprano yazdı:

On Fri, 24 Apr 2015 01:51 am, brokolists wrote:


my problem is i m working with very long float numbers and i use
numberx =float(( input( 'enter the number\n ')))
after i use this command when i enter something more than 10 or 11 digits
it uses like 1e+10 or something like that but i have to calculate it
without scientific 'e' type. what can i do to solve this? (sorry for my
bad english)







i fixed the "e" problem by using digits

getcontext().prec=20
it works fine but in my algorithm it was checking the answers by their length so 
this created a new problem by giving all the answers with 20 digits. I might 
fix this but i have searched for 4 hours just how to get rid of 'e'. i think 
there must be an easier way to do this.



We could be a lot more helpful if you just spelled it out.  digits and 
getcontext() are not builtins, so we don't really know how you're using 
them, except by guessing.


You've changed the code, probably by using one of the many suggestions 
here.  But what's it look like now?


You should have started the thread by telling us the Python version. 
I'm guessing you're using python 3.x



In Python 3.x, input() returns a string.  So you can tell how much the 
user typed by doing a len() on that string.  Once you convert it into 
some other form, you're choosing the number of digits partly by how you 
convert it.
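
On the underlying 'e' problem: fixed-point formatting suppresses scientific 
notation without touching decimal contexts at all. A minimal sketch, assuming 
Python 3:

```python
x = 1e20
print(str(x))            # prints: 1e+20  (str() switches to scientific notation)
print(format(x, '.0f'))  # prints: 100000000000000000000  (fixed-point, no 'e')
```

The same '.0f' (or '.2f', etc.) spec works in str.format() and in the older 
% formatting.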


Can you state what your real assignment is?  And what code you have so 
far?  And just what is wrong with it, that you need to change?


For example, if the "correct" answer for your user is "7.3"  and the 
user types  "7.300" is that supposed to be right, or wrong?





--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: A question on the creation of list of lists

2015-04-23 Thread Dave Angel

On 04/23/2015 08:36 AM, Gregory Ewing wrote:

Jean-Michel Pichavant wrote:

From: "subhabrata banerji" 

list_of_files = glob.glob('C:\Python27\*.*')

 >

1/ Your file pattern search will not get files that do not have any
dot in
their name


Actually, on Windows, it will. (This is for compatibility with
MS-DOS 8.3 filenames, where the dot wasn't actually stored, so
there was no distinction between a filename with a dot and an
empty extension, and a filename with no dot.)



That's certainly true at the command line in a Windows DOS box, but I'd be 
astounded if glob used the same logic, even on Windows.
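
Easy enough to check. On a POSIX system, glob takes '*.*' literally and 
excludes dotless names (Windows may behave differently, per the discussion 
above). A sketch using a throwaway directory:

```python
import glob
import os
import tempfile

d = tempfile.mkdtemp()
for name in ('withdot.txt', 'nodot'):
    open(os.path.join(d, name), 'w').close()

matches = sorted(os.path.basename(p)
                 for p in glob.glob(os.path.join(d, '*.*')))
print(matches)  # on POSIX: ['withdot.txt'] -- 'nodot' is excluded
```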


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: May I drop list bracket from list?

2015-04-23 Thread Dave Angel

On 04/23/2015 06:11 AM, subhabrata.bane...@gmail.com wrote:

Dear Group,

I am trying to read a list of files as
list_of_files = glob.glob('C:\Python27\*.*')
Now I am trying to read each one of them,
convert into list of words, and append to a list
as.

list1=[]
for file in list_of_files:
   print file
   fread1=open(file,"r").read()
   fword=fread1.split()
   list1.append(fword)

Here the list is a list of lists, but I want only one list not
list of lists.

I was thinking of stripping it as, str(list1).strip('[]')

but in that case it would be converted to string.

Is there a way to do it. I am using Python27 on Windows7 Professional.
Apology for an indentation error.

If anybody may please suggest.



Your first problem is the name of your variable.  fword implies it's a 
string, but it's really a list.  So when you do:

 list1.append(fword)

you're appending a list to a list, which gives you nested lists.  Sounds 
like you want

 list1.extend(fword)
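
The difference in one small demo (the word lists are stand-ins for the OP's 
per-file split() results):

```python
files_words = [['a', 'b'], ['c', 'd']]   # words from two hypothetical files

nested, flat = [], []
for fword in files_words:
    nested.append(fword)   # appends the list object itself
    flat.extend(fword)     # appends the list's elements

print(nested)  # prints: [['a', 'b'], ['c', 'd']]
print(flat)    # prints: ['a', 'b', 'c', 'd']
```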



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Diff between object graphs?

2015-04-22 Thread Dave Angel

On 04/22/2015 09:46 PM, Chris Angelico wrote:

On Thu, Apr 23, 2015 at 11:37 AM, Dave Angel  wrote:

On 04/22/2015 09:30 PM, Cem Karan wrote:



On Apr 22, 2015, at 8:53 AM, Peter Otten <__pete...@web.de> wrote:


Another slightly more involved idea:

Make the events pickleable, and save the simulator only for every 100th
(for
example) event. To restore the 7531th state load pickle 7500 and apply
events 7501 to 7531.



I was hoping to avoid doing this as I lose information.  BUT, its likely
that this will be the best approach regardless of what other methods I use;
there is just too much data.



Why would that lose any information???


It loses information if event processing isn't perfectly deterministic.


Quite right.  But I hadn't seen anything in this thread to imply that.

I used an approach like that on the Game of Life, in 1976.  I saved 
every 10th or so state, and was able to run the simulation backwards by 
going forward from the previous saved state.  In this case, the analogue 
of the "event" is determined from the previous state.  But it's quite 
similar, and quite deterministic.
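
The scheme Peter described can be sketched in a few lines. The integer event 
stream and apply_event() below are stand-ins for the real simulator, assumed 
deterministic:

```python
import pickle

CHECKPOINT_EVERY = 100

def apply_event(state, event):
    # Deterministic transition; a stand-in for the real simulator step.
    return state + event

events = list(range(1, 7532))             # hypothetical event stream
state = 0
checkpoints = {0: pickle.dumps(state)}    # pickle every Nth state
for i, ev in enumerate(events, 1):
    state = apply_event(state, ev)
    if i % CHECKPOINT_EVERY == 0:
        checkpoints[i] = pickle.dumps(state)

def restore(n):
    # Load the nearest earlier checkpoint, then replay events up to n.
    base = (n // CHECKPOINT_EVERY) * CHECKPOINT_EVERY
    s = pickle.loads(checkpoints[base])
    for ev in events[base:n]:
        s = apply_event(s, ev)
    return s

print(restore(7531) == sum(range(1, 7532)))  # -> True
```

As noted, this only reproduces state 7531 exactly if replaying the events is 
perfectly deterministic.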




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Diff between object graphs?

2015-04-22 Thread Dave Angel

On 04/22/2015 09:30 PM, Cem Karan wrote:


On Apr 22, 2015, at 8:53 AM, Peter Otten <__pete...@web.de> wrote:


Another slightly more involved idea:

Make the events pickleable, and save the simulator only for every 100th (for
example) event. To restore the 7531th state load pickle 7500 and apply
events 7501 to 7531.


I was hoping to avoid doing this as I lose information.  BUT, its likely that 
this will be the best approach regardless of what other methods I use; there is 
just too much data.



Why would that lose any information???


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing module and matplotlib.pyplot/PdfPages

2015-04-21 Thread Dave Angel

On 04/21/2015 07:54 PM, Dennis Lee Bieber wrote:

On Tue, 21 Apr 2015 18:12:53 +0100, Paulo da Silva
 declaimed the following:




Yes. fork will do that. I have just looked at it and it is the same as
unix fork (module os). I am thinking of launching several forks that
will produce .png images and at the end I'll call "convert" program to
place those .png files into a pdf book. A poor solution but much faster.



To the best of my knowledge, on a UNIX-type OS, multiprocessing /uses/
fork() already. Windows does not have the equivalent of fork(), so
multiprocessing uses a different method to create the process
(conceptually, it runs a program that does an import of the module followed
by a call to the named method -- which is why one must follow the

if __name__  ...

bit prevent the subprocess import from repeating the original main program.



The page:

https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

indicates that there are 3 ways in which a new process may be started. 
On Unix you may use any of the three, while on Windows, you're stuck 
with spawn.


I *think* that in Unix, it always does a fork.  But if you specify 
"spawn" in Unix, you get all the extra overhead to wind up with what 
you're describing above.  If you know your code will run only on Unix, 
you presumably can get much more efficiency by using the fork 
start-method explicitly.


I haven't done it, but it would seem likely to me that forked code can 
continue to use existing global variables.  Changes to those variables 
would not be shared across the two forked processes.  But if this is 
true, it would seem to be much easier way to launch the second process, 
if it's going to be nearly identical to the first.


Maybe this is just describing the os.fork() function.
   https://docs.python.org/3.4/library/os.html#os.fork
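
On a Unix box the start method can be chosen explicitly via a context; a 
minimal sketch (POSIX-only, since 'fork' is unavailable on Windows):

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == '__main__':
    ctx = mp.get_context('fork')   # skip spawn's re-import overhead
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # prints: [1, 4, 9]
```

As discussed above, each forked worker sees a copy of the parent's globals at 
fork time; changes made afterwards are not shared back.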

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing module and matplotlib.pyplot/PdfPages

2015-04-21 Thread Dave Angel

On 04/20/2015 10:14 PM, Paulo da Silva wrote:

I have program that generates about 100 relatively complex graphics and
writes then to a pdf book.
It takes a while!
Is there any possibility of using multiprocessing to build the graphics
and then use several calls to savefig(), i.e. some kind of graphic's
objects?



To know if this is practical, we have to guess about the code you 
describe, and about the machine you're using.


First, if you don't have multiple cores on  your machine, then it's 
probably not going to be any faster, and might be substantially slower. 
 Ditto if the code is so large that multiple copies of it will cause 
swapping.


But if you have 4 cores, and a processor-bound algorithm, it can indeed 
save time to run 3 or 4 processes in parallel.  You'll have to write 
code to parcel out the parts that can be done in parallel, and make a 
queue that each process can grab its next assignment from.


There are other gotchas, such as common code that has to be run before 
any of the subprocesses.  If you discover that each of these 100 pieces 
has to access data from earlier pieces, then you could get bogged down 
in communications and coordination.


If the 100 plots are really quite independent, you could also consider 
recruiting time from multiple machines.  As long as the data that needs 
to go between them is not too large, it can pay off big time.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Opening Multiple files at one time

2015-04-21 Thread Dave Angel

On 04/21/2015 03:56 AM, subhabrata.bane...@gmail.com wrote:



Yes. They do not. They are opening one by one.
I have some big chunk of data I am getting by crawling etc.
now as I run the code it is fetching data.
I am trying to fetch the data from various sites.
The contents of the file are getting getting stored in
separate files.
For example, if I open the site "http://www.theguardian.com/international", then the result may be stored 
in file "file1.txt", and the contents of the site "http://edition.cnn.com/" may be stored in 
file "file2.txt".

But the contents of each site changes everyday. So everyday as you open these 
sites and store the results, you should store in different text files. These 
text files I am looking to be created on its own, as you do not know its 
numbers, how many numbers you need to fetch the data.

I am trying to do some results with import datetime as datetime.datetime.now()
may change everytime. I am trying to put it as name of file. But you may 
suggest better.



To get the text version of today's date, use something like:

import datetime
import itertools
SUFFIX = datetime.datetime.now().strftime("%Y%m%d")

To write a filename generator that generates names sequentially  (untested):

def filenames(suffix=SUFFIX):
    for integer in itertools.count(1):
        yield "{0:04d}".format(integer) + "-" + suffix


for filename in filenames():
    f = open(filename, "w")
    # ... do some work here which writes to the file, and does "break"
    # ... if we don't need any more files
    f.close()

Note that this is literally open-ended.  If you don't put some logic in 
that loop which will break, it'll write files till your OS stops you, 
either because of disk full, or directory too large, or whatever.


I suggest you test the loop out by using a print() before using it to 
actually create the files.


In the format above, I used 4 digits, on the assumption that usually 
that is enough.  If you need more than that on the occasional day, it 
won't break, but the names won't be nicely sorted when you view them.



If this were my problem, I'd also use a generator for the web page 
names.  If you write that generator, then you could do something like 
(untested):


for filename, url in zip(filenames(), urlnames()):
    f = open(filename, "w")
    # ... process the url, writing to file f
    f.close()

In this loop, it'll automatically end when you run out of urlnames.


I also think you should consider making the date the directory name you 
use, rather than putting many days files in a single directory.  But 
this mainly affects the way you concatenate the parts together.  You'd 
use os.path.join() rather than  "+" to combine parts.
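
A sketch of the per-day directory layout (the date and filename below are 
just illustrative):

```python
import datetime
import os

day = datetime.date(2015, 4, 21).isoformat()   # '2015-04-21'
path = os.path.join(day, '0001.txt')
print(path)  # '2015-04-21/0001.txt' on POSIX; backslash separator on Windows
```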



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Opening Multiple files at one time

2015-04-20 Thread Dave Angel

On 04/20/2015 07:59 AM, subhabrata.bane...@gmail.com wrote:

Dear Group,

I am trying to open multiple files at one time.
I am trying to do it as,

  for item in  [ "one", "two", "three" ]:
f = open (item + "world.txt", "w")
f.close()

This is fine.


But it does not open multiple files at one time.  Perhaps you'd better 
explain better what you mean by "at one time."



But I was looking if I do not know the number of
text files I would create beforehand,


So instead of using a literal list [ "one", "two... ]
you construct one, or read it in from disk, or use sys.argv.  What's the 
difficulty?  Nothing in the code fragment you show cares how long the 
list is.



 so not trying xrange option

also.


No idea what "the xrange option" means.



And if in every run of the code if the name of the text files have
to created on its own.


So figure out what you're going to use for those names, and write the 
code to generate them.




Is there a solution for this?


For what?


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python and lotus notes

2015-04-20 Thread Dave Angel

On 04/20/2015 04:29 AM, gianluca.pu...@gmail.com wrote:

Hi,



Hi and welcome.
I don't know Lotus Notes, but i can at least comment on some of your 
code, pointing out at least some problems.



i am having a problem when i try to access lotus notes with python, think i do 
all ok but it seems something is going wrong because i can't print any db title 
even if i've opened the .nsf file.

My code:

import win32com.client
from win32com.client import Dispatch

notesServer='Mail\wed\new\IT'


You're using backslashes in literal strings when there's nothing escaped 
there.  So you need to either double them or use a raw string.  It's 
also possible that you could use forward slashes, but I can't be sure.


I'd simply use
notesServer = R'Mail\wed\new\IT'



notesFile= 'Apps\Mydb.nsf'
notesPass= 'mypass'

session = Dispatch('Lotus.NotesSession')
session.Initialize(notesPass)
print ""

db = session.getDatabase(notesServer, notesFile)

print db.Open


You forgot the parentheses.  So you're not calling the Open function, 
you're just printing it.



print db.IsOpen
print db.Title

but i receve:

>>
False

Please can u help me?
GL



There may be other things, but if you fix the three lines, it'll 
probably get further.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: do you guys help newbies??

2015-04-19 Thread Dave Angel

On 04/19/2015 09:37 PM, josiah.l...@stu.nebo.edu wrote:

On Wednesday, November 27, 2002 at 4:01:02 AM UTC-7, John Hunter wrote:

"malik" == malik martin  writes:


 malik>i'm having a simple problem i guess.  but i still dont
 malik> know the answer. anyone see anything wrong with this code?
 malik> i think it's the line in the while loop that starts with
 malik> "averagea = ..."  because it will execute once but once the
 malik> loop goes around one more time it gets an error. is there
 malik> it because averagea has been assigned a definate value? if
 malik> so how do i record the value but reset averagea back to 0(
 malik> i guess it would be 0)

 malik>any help would be apreciated. thanks!!

I think you forgot to post your code.   Makes it harder.

JDH




And I think you forgot to check the date of the post to which you're 
responding.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python and fortran Interface suggestion

2015-04-19 Thread Dave Angel

On 04/19/2015 11:56 AM, pauld11718 wrote:

I shall provide with further details

Its about Mathematical modelling of a system. Fortran does one part and python 
does the other part (which I am suppose to provide).
For a time interval tn --> t_n+1, fortran code generates some values, for which my 
python code accepts it as an input. It integrates some data for the same time step tn 
--> tn+1 and fortran computes based on this for the next time step t_n+1 --> 
t_n+2..and this is how loop continues.

Its the fortran code calling my Python executable at all the time interval.


Then you'd better find out how it's calling your executable.  Calling it 
is very different from starting it.  The whole "import x, y, z" is done 
only upon initial startup of the python code.  You can then call the 
Python code as many times as you like without incurring that overhead again.


Naturally, if the guy who designed the Fortran code didn't think the 
same way, and is unavailable for further tweaking, you've got a problem.




So,
1. Is it about rebuilding every thing using cx_freeze kind of stuff in windows?


Don't worry about how you get installed until after you figure out how 
you're going to get those calls and data back and forth between the 
Fortran code and your own.




2. As interfacing between fortran/python is via .dat file, so on every 
time-step python executable is called it shall open()/close() .dat file. This 
along with all those
from <module> import *'s
are unnecessary.


Have you got specs on the whole dat file thing?  How will you know it's 
time to examine the file again?  As long as you get notified through 
some other mechanism, there's no problem in both programs having an open 
handle on the shared file.


For that matter, the notification can be implicit in the file as well.

But is there a designer behind this, or is the program intending to deal 
with something else and you're just trying to sneak in the back door?





3. I dont have access to fortran code. I shall provide the python executable 
only, which will take input from .dat file and write to another .dat file. So, 
what is your suggestion regarding socket/pipe and python installation on RAM? I 
have no idea with those.



Not much point in opening a pipe if the other end isn't going to be 
opened by the Fortran code.



Isn't there a better way to handle such issues?



Sure, there are lots of ways.  But if the Fortran code is a closed book, 
you'll have to find out how it was designed, and do all the adapting at 
your end.


If it becomes open again, then you have the choice between having one of 
your languages call functions in the other (ie. single process), having 
some mechanism (like queues, or sockets) between two separately started 
processes, and having one program launch the other repeatedly.  That 
last is probably the least performant choice.


If you google "python Fortran" you'll get lots of hits.  You could start 
reading to see what your choices are.  But if the Fortran designer has 
left the room, you'll be stuck with whatever he put together.


And chances are very high that you'll need to develop, and certainly 
test, some parts of the project on the Windows machine, and particularly 
with the Windows C/Fortran compilers.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-19 Thread Dave Angel

On 04/19/2015 09:02 AM, Steven D'Aprano wrote:

On Sun, 19 Apr 2015 04:08 am, Albert van der Horst wrote:


Fire up a lowlevel interpreter like Forth. (e.g. gforth)


Yay! I'm not the only one who uses or likes Forth!

Have you tried Factor? I'm wondering if it is worth looking at, as a more
modern and less low-level version of Forth.



I also like Forth (since 83), but haven't done much in the last decade.

I was newsletter editor for our local FIG for many years.

I have met and debated with Elizabeth Rather, and been a "third hand" 
for Chuck Moore when he was re-soldering wires on his prototype Forth board.


You can see my name in the X3J14 standard:
   https://www.taygeta.com/forth/dpans1.htm#x3j14.membership


I'd be interested in a "more modern" Forth, but naturally, as a member 
of band of rugged individualists, I wonder if it can possibly satisfy 
more than one of us.



http://factorcode.org/

That site is the first place I recall seeing "concatenative" as a type of 
language.  Interesting way of thinking of it.  I just call it RPN, and 
relate it to the original HP35 calculator ($400, in about 1972).


From the overview, it looks like they're at least aiming at what I 
envisioned as the next Forth I wanted to use.  Instead of putting ints 
and addresses on the stack, you put refs to objects, in the Python 
sense.  Those objects are also gc'ed.  I don't know yet whether 
everything is an object, or whether (like Java), you have boxed and 
unboxed thingies.


Sounds interesting, and well worth investigating.  thanks for pointing 
it out.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: New to Python - block grouping (spaces)

2015-04-19 Thread Dave Angel

On 04/19/2015 07:38 AM, BartC wrote:


Perhaps you don't understand what I'm getting at.

Suppose there were just two syntaxes: C-like and Python-like (we'll put
aside for a minute the question of what format is used to store Python
source code).

Why shouldn't A configure his editor to display a Python program in
C-like syntax, and B configure their editor to use Python-like tabbed
syntax?

A can write code in the preferred syntax, and B can view/modify exactly
the same code in /their/ preferred syntax. What would be the problem?
(The actual stored representation of the program would be in one of
those two styles, or something else entirely; Lisp-like syntax for
example. It doesn't matter because no-one would care.

(I think much of the problem that most languages are intimately
associated with their specific syntax, so that people can't see past it
to what the code is actually saying. a=b, a:=b, b=>a, (setf a b),
whatever the syntax is, who cares? We just want to do an assignment!)



If you make enough simplifying assumptions, of course it's easy and natural.

Assume that a text editor is the only way you'll be viewing the source 
code.  You and your coworkers are willing to each use a prescribed text 
editor, rather than your previous favorite one that doesn't happen to 
support the customization you're suggesting here.  You're not going to 
use a version control system, nor diff programs.  You're not going to 
post your code in messages on the internet.


You're not going to display error messages showing source code, from a 
running system.  You're not going to use the interactive debugger.


You're not going to copy code from one place to another without copying 
sufficient context to be able to reconstruct the funny syntax.


You're going to permit only one such variation, and it's just big enough 
that it's always obvious which version of the code is being examined (by 
programs or by humans) [1]


You're not going to use eval()  [good].  You're not going to examine 
source code that comes from 3rd party or the standard library.


The changes you want are all completely reversible, regardless of 
interspersed comments, and when reversed, preserve spacing and 
characters completely in the way that each user expects to see the code. 
 And they are reversible even if you only have a small fragment of the 
code.


I implemented something analogous to such a system 40 years ago.  But on 
that system, nearly all of the restrictions above applied.  There was no 
clipboard, no version control, no internet.  Programs had to fit in 64k. 
Each line stood completely on its own, so context was not a problem.  I 
wrote the text editor, much of the interactive debugger, the listing 
utility, the pretty printer, the cross reference program, etc.  So they 
were consistent.  Further, we had a captive audience -- if they didn't 
like it, they could buy the competitor's product, which had similar 
constraints.  Would I do things differently now?  You bet I would.  At 
least if I could upgrade the memory addressability to a few megs.



For the purposes you've described so far, I'd suggest just writing two 
translators, one a mirror image of the other.  Invoke them automatically 
on load and save in your personal editor  And learn to live with both 
syntaxes, because it will leak.  And if it's really not reversible, 
don't use them when you're editing somebody else's code.  And even if it 
is reversible, don't use them when you edit code you plan to post on the 
internet.


[1] So for example, a=b cannot mean assignment in one version, and a 
comparison in the other.  Because then you'd need to either discover 
that it's a line by itself, or within an if expression, or ...  But you 
could arrange two disjoint sets of symbols readily enough.  In your 
version, require either := or ==, and make = just plain illegal.


PS.  I haven't examined the costs in tool development to support this. 
Just whether it's practical.  You can easily discount any one of these 
constraints, but taken as a whole, they very much limit what you can do.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: HELP! How to return the returned value from a threaded function

2015-04-18 Thread Dave Angel

On 04/18/2015 01:07 PM, D. Xenakis wrote:

Maybe this is pretty simple but seems I am stuck...

def message_function():
 return "HelloWorld!"


def thread_maker():
 """
 call message_function()
 using a new thread
 and return it's "HelloWorld!"
 """
 pass


Could someone please complete above script so that:

thread_maker() == "HelloWorld!"

Please import the library you suggest too.
I tried threading but I could not make it work.

THX!



The first question is why you are even starting extra threads.  They are 
only useful if they do concurrent work.  So if you're starting a thread, 
then not doing anything yourself till the thread terminates, you've 
gained absolutely nothing.  And greatly complicated your code besides.


In fact, even if you start multiple threads, and/or continue doing your 
own work in the main thread, you might not gain anything in CPython, 
because of the GIL.


But if I assume you've justified the use of threads, or your 
boss/instructor has mandated them, then you still have to decide what 
the flow of logic is going to mean.


If you're going to start a bunch of threads, wait a while, and do 
something with the "return values" of some subset of them, then all you 
need is a way to tell whether those threads have yet posted their 
results.  If there's exactly one result per thread, then all you need is 
a global structure with room for all the results, and with an initial 
value that's recognizably different.


So fill a list with None elements, and tell each thread what element of 
the list to update.  Then in your main thread, you trust any list 
element that's no longer equal to None.


Alternatively, you might decide your main program will wait till all the 
threads have terminated.  In that case, instead of checking for None, 
simply do a join on all the threads before using the results.
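
A minimal sketch of the slot-per-thread approach applied to the OP's 
functions (the worker wrapper is mine, not part of the original code):

```python
import threading

def message_function():
    return "HelloWorld!"

def thread_maker():
    results = [None]                      # one slot per thread
    def worker():
        results[0] = message_function()   # the thread posts its result
    t = threading.Thread(target=worker)
    t.start()
    t.join()                              # wait, then trust the slot
    return results[0]

print(thread_maker())  # prints: HelloWorld!
```

Note the join() means the main thread does nothing concurrently here, which 
is exactly the "gained absolutely nothing" case described above.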


But many real threaded programs use much more complex interactions, and 
they reuse threads rather than expecting them to terminate as soon as 
one result is available.  So there might not be a single "result" per 
thread, and things get much more complex.  This is where you would 
normally start studying queues.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Failed to import a "pyd: File When python intepreter embed in C++ project

2015-04-17 Thread Dave Angel

On 04/17/2015 01:17 PM, saadaouijihed1...@gmail.com wrote:

I have a swig module (.pyd).I followed the steps but it doesn't work
please help me.



First, unplug the computer and remove the battery.  Then if it's still 
burning, douse it with a fire extinguisher.


If your symptoms are different, you'll have to spell it out.  Like 
Python version, OS, complete stack trace.  And anything else that may be 
needed to diagnose the problem.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Converting text file to different encoding.

2015-04-17 Thread Dave Angel

On 04/17/2015 10:48 AM, Dave Angel wrote:

On 04/17/2015 09:19 AM, subhabrata.bane...@gmail.com wrote:



>>> target = open("target", "w")


It's not usually a good idea to use the same variable for both the file
name and the opened file object.  What if you need later to print the
name, as in an error message?


Oops, my error.  Somehow my brain didn't notice the quote marks, until I 
reread my own message online.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Converting text file to different encoding.

2015-04-17 Thread Dave Angel

On 04/17/2015 09:19 AM, subhabrata.bane...@gmail.com wrote:

I am having few files in default encoding. I wanted to change their encodings,
preferably in "UTF-8", or may be from one encoding to any other encoding.



You neglected to specify what Python version this is for.  Other 
information that'd be useful is whether the file size is small enough 
that two copies of it will all fit reasonably into memory.


I'll assume it's version 2.7, because of various clues in your sample 
code.  But if it's version 3.x, it could be substantially easier.



I was trying it as follows,

>>> import codecs
>>> sourceEncoding = "iso-8859-1"
>>> targetEncoding = "utf-8"
>>> source = open("source1","w")


mode "w" will truncate the source1 file, leaving you nothing to process. 
 I'd suggest "r"



>>> target = open("target", "w")


It's not usually a good idea to use the same variable for both the file 
name and the opened file object.  What if you need later to print the 
name, as in an error message?



>>> target.write(unicode(source, sourceEncoding).encode(targetEncoding))


I'd not recommend trying to do so much in one line, at least until you 
understand all the pieces.  Programming is not (usually) a contest to 
write the most obscure code, but rather to make a program you can still 
read and understand six months from now.  And, oh yeah, something that 
will run and accomplish something.


>
> but it was giving me error as follows,
> Traceback (most recent call last):
>File "<stdin>", line 1, in <module>
>  target.write(unicode(source, sourceEncoding).encode(targetEncoding))
> TypeError: coercing to Unicode: need string or buffer, file found


if you factor this you will discover your error.  Nowhere do you read 
the source file into a byte string.  And that's what is needed for the 
unicode constructor.  Factored, you might have something like:


 encodedtext = source.read()
 text = unicode(encodedtext, sourceEncoding)
 reencodedtext = text.encode(targetEncoding)
 target.write(reencodedtext)

Next, you need to close the files.

source.close()
target.close()

There are a number of ways to improve that code, but this is a start.

Improvements:

 Use codecs.open() to open the files, so encoding is handled 
implicitly in the file objects.


 Use with... syntax so that the file closes are implicit

 read and write the files in a loop, a line at a time, so that you 
needn't have all the data in memory (at least twice) at one time.  This 
will also help enormously if you encounter any errors, and want to 
report the location and problem to the user.  It might even turn out to 
be faster.


 You should write non-trivial code in a text file, and run it from 
there.
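For illustration only (the function name and default encodings are 
placeholders, not from the original post), a sketch combining those 
improvements:

```python
import codecs

# Sketch combining the improvements above: codecs.open() makes the
# decoding/encoding implicit, "with" closes both files automatically,
# and copying one line at a time avoids holding the whole file in memory.
def convert(src_name, dst_name, src_enc="iso-8859-1", dst_enc="utf-8"):
    with codecs.open(src_name, "r", encoding=src_enc) as source, \
         codecs.open(dst_name, "w", encoding=dst_enc) as target:
        for line in source:      # each line arrives already decoded
            target.write(line)   # and is re-encoded on the way out
```

This form works on Python 2 and 3 alike, which sidesteps the unicode() 
question entirely.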


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: ctypes: using .value or .value() doesn't work for c_long

2015-04-15 Thread Dave Angel

On 04/15/2015 03:48 PM, IronManMark20 wrote:

I am using ctypes to call a few windll funcions. One of them returns a c_long 
object. I want to know what number the function returns.

Problem is, when I try foo.value , it gives me this:

AttributeError: LP_c_long object has no attribute value.

Any idea of what could cause this?



I don't use Windows, so I can't try it, especially since you didn't 
include any code.  And apparently whatever function you're using 
returns a pointer to a c_long, not a c_long.


I see Ian has given you an answer.  But let me show you how you might 
have come up with it on your own, for next time.


The fragment of the error you include says that the instance of the 
class LP_c_long doesn't have an attribute called "value".  But it 
undoubtedly has other attributes, so you could look with dir


print(dir(foo))

Presumably from there you'll see that you need the 'contents' attribute. 
 Then you print that out, and/or dir a dir() on it, and you see that 
you now have a value attribute.



https://docs.python.org/3/library/ctypes.html
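The same exploration can be reproduced without Windows; this sketch 
builds a pointer to a c_long directly instead of calling a windll 
function:

```python
import ctypes

# Build an LP_c_long the portable way; it behaves like the windll result.
n = ctypes.c_long(42)
p = ctypes.pointer(n)

# p.value would raise AttributeError, but dir() reveals 'contents',
# and contents is the c_long carrying the 'value' we were after.
assert "contents" in dir(p)
assert p.contents.value == 42
```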

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is there functions like the luaL_loadfile and luaL_loadbuffer in lua source to dump the lua scripts in Python's source?

2015-04-14 Thread Dave Angel

On 04/14/2015 08:07 AM, zhihao chen wrote:

Hi, I want to dump the python script when some application (an Android app)
uses the python engine. They embed python into the app.


  I can dump the script from an app which use the lua engine through the
luaL_loadbuffer or luaL_loadfile (just hook this function,and dump the lua
script by the (char *)buffer and size_t )

And I want to know:

Is there a function like luaL_loadfile or luaL_loadbuffer in python's
source to read python files, so that I can hook it to get
the (char *)buffer and size, and so get the *.py or *.pyc?

In other words, which C/C++ function is responsible for loading all of the
python scripts or pyc files? I want to dump the scripts from that.



I know nothing about lua, but the __file__ attribute on most user 
modules will tell you the filename it's loaded from.  Naturally, that 
assumes it actually came from a file, and a few other things.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Pickle based workflow - looking for advice

2015-04-13 Thread Dave Angel

On 04/13/2015 10:58 AM, Fabien wrote:

Folks,



A comment.  Pickle is a method of creating persistent data, most 
commonly used to preserve data between runs.  A database is another 
method.  Although either one can also be used with multiprocessing, you 
seem to be worrying more about the mechanism, and not enough about the 
problem.
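For what it's worth, the pickle-between-runs pattern being described can 
be sketched like this (all names here are illustrative, not from the 
original post):

```python
import os
import pickle

# Compute an expensive result once, persist it, and reload it on later runs.
def compute_or_load(path, compute):
    if os.path.exists(path):          # a previous run already did the work
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)        # persist for the next run
    return result
```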



I am writing a quite extensive piece of scientific software. Its
workflow is quite easy to explain. The tool realizes series of
operations on watersheds (such as mapping data on it, geostatistics and
more). There are thousands of independent watersheds of different size,
and the size determines the computing time spent on each of them.


First question:  what is the name or "identity" of a watershed? 
Apparently it's named by a directory.  But you mention ID as well.  You 
write a function A() that takes only a directory name. Is that the name 
of the watershed?  One per directory?  And you can derive the ID from 
the directory name?


Second question, is there any communication between watersheds, or are 
they totally independent?


Third:  this "external data", is it dynamic, do you have to fetch it in 
a particular order, is it separated by watershed id, or what?


Fourth:  when the program starts, are the directories all empty, so the 
presence of a pickle file tells you that A() has run?  Or is there some 
other meaning for those files?




Say I have the operations A, B, C and D. B and C are completely
independent but they need A to be run first, D needs B and C, and so
forth. Eventually the whole operations A, B, C and D will run once for
all,


For all what?


but of course the whole development is an iterative process and I
rerun all operations many times.


Based on what?  Is the external data changing, and you have to rerun 
functions to update what you've already stored about them?  Or do you 
just mean you call the A() function on every possible watershed?




(I suddenly have to go out, so I can't comment on the rest, except that 
choosing to pickle, or to marshall, or to database, or to 
custom-serialize seems a bit premature.  You may have it all clear in 
your head, but I can't see what the interplay between all these calls to 
one-letter-named functions is intended to be.)



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Excluding a few pawns from the game

2015-04-13 Thread Dave Angel

On 04/13/2015 07:30 AM, userque...@gmail.com wrote:

I am writing a function in python, where the function excludes a list of pawns 
from the game. The condition for excluding the pawns is whether the pawn is 
listed in the database DBPawnBoardChart. Here is my code:

   def _bring_bigchart_pawns(self, removed_list=set(), playing_amount=0):
 chart_pawns = DBPawnBoardChart.query().fetch()
 chart_pawns_filtered = []
 for bigchart_pawn in chart_pawns:
 pawn_number = bigchart_pawn.key.number()
 db = DBPawn.bring_by_number(pawn_number)
 if db is None:
 chart_pawn.key.delete()
 logging.error('DBPawnBoardChart entry is none for chart_pawn = 
%s' % pawn_number)
 if pawn_number in chart_pawns:
 chart_pawn.add(pawn_number)
 else:
 exclude_pawn_numbers = ['1,2,3']
 chart_pawns_filtered.append(chart_pawn)
 pawn_numbers = [x.key.number() for x in chart_pawns_filtered]
 return pawn_numbers, chart_pawns, exclude_pawn_numbers

If the pawn is listed in DBPawnBoardChart it should be added to the game, or 
else it should be excluded. I am unable to exclude the pawns; the 
DBPawnBoardChart contains a Boardnumber and a Pawnnumber. What changes 
should I make?



Looks to me like an indentation error.  About 9 of those lines probably 
need to be included in the for loop, so they have to be indented further in.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: installing error in python

2015-04-12 Thread Dave Angel

On 04/13/2015 01:38 AM, Mahima Goyal wrote:

error of corrupted file or directory is coming if i am installing
python for 64 bit.



And what OS is that on, and how long has it been since you've done a 
file system check on the drive?



For that matter, what version of Python are you installing, and where 
are you getting the download for it?


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-12 Thread Dave Angel

On 04/13/2015 01:25 AM, Paul Rubin wrote:

Dave Angel  writes:

But doesn't math.pow return a float?...
Or were you saying bignums bigger than a float can represent at all?  Like:

x = 2**11111 - 1  ...
math.log2(x)

11111.0


Yes, exactly that.


Well that value x has some 3300 digits, and I seem to recall that float 
can only handle around 10**308.  But if the point of all this is to 
decide when to stop dividing, I think our numbers here are somewhere 
beyond the heat death of the universe.


  Thus (not completely tested):


 def isqrt(x):
 def log2(x): return math.log(x,2)  # python 2 compatibility
 if x < 1e9:


Now 10**9 is way below either limit of floating point.  So i still don't 
know which way you were figuring it.  Just off the top of my head, I 
think 10**18 is approx when integers don't get exact representation, and 
roughly 10**308 is where you can't represent numbers as floats at all.



             return int(math.ceil(math.sqrt(x)))
         a, b = divmod(log2(x), 1.0)
         c = int(a/2) - 10
         d = (b/2 + a/2 - c + 0.001)
         # now c + d = log2(x)/2 + 0.001, c is an integer, and
         # d is a float between 10 and 11
         s = 2**c * int(math.ceil(2**d))
         return s

should return slightly above the integer square root of x.  This is just
off the top of my head and maybe it can be tweaked a bit.  Or maybe it's
stupid and there's an obvious better way to do it that I'm missing.



If you're willing to use the 10**320 or whatever it is for the limit, I 
don't see what's wrong with just doing floating point sqrt.  Who cares 
if it can be an exact int, since we're just using it to get an upper limit.


And I can't figure out your code at this hour of night, but it's much 
more complicated than Newton's method would be anyway.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-12 Thread Dave Angel

On 04/12/2015 11:30 PM, Paul Rubin wrote:

Dave Angel  writes:

If I were trying to get a bound for stopping the divide operation, on
a value too large to do exact real representation, I'd try doing just
a few iterations of Newton's method.


Python ninja trick: math.log works on bignums too large to be
represented as floats ;-)



But doesn't math.pow return a float?   At first crack I figured it was 
because I had supplied math.e as the first argument.  But I have the 
same problem with  (python 3.4)



x = 2596148429267413814265248164610047
print( math.pow(2, math.log2(x)) )
2.596148429267414e+33

Or were you saying bignums bigger than a float can represent at all?  Like:

>>> x = 2**11111 - 1
>>> len(str(x))
3345
>>> math.log2(x)
11111.0
>>> math.pow(2, math.log2(x)//2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: math range error


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-12 Thread Dave Angel

On 04/12/2015 09:56 PM, Paul Rubin wrote:

Marko Rauhamaa  writes:

And in fact, the sqrt optimization now makes the original version 20%
faster: ...
 bound = int(math.sqrt(n))


That could conceivably fail because of floating point roundoff or
overflow, e.g. fac(3**1000).  A fancier approach to finding the integer
square root might be worthwhile though.



If I were trying to get a bound for stopping the divide operation, on a 
value too large to do exact real representation, I'd try doing just a 
few iterations of Newton's method.  Even if you don't converge it to get 
an exact value, you can arrange that you have a number that's for sure 
no less than the square root.  And you can get pretty close in just a 
few times around.
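One way to realise that suggestion (a sketch; for simplicity this runs 
Newton's method to convergence rather than stopping after a few rounds):

```python
# Classic Newton iteration for the integer square root; it converges to
# floor(sqrt(n)), and the final +1 guarantees the result is no less than
# the true square root -- a safe stopping bound for trial division.
def isqrt_upper(n):
    x = n
    y = (x + 1) // 2
    while y < x:                 # estimates decrease until convergence
        x = y
        y = (x + n // x) // 2
    return x + 1
```

The whole computation stays in exact integer arithmetic, so it works for 
values far too large for float sqrt.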


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: try..except with empty exceptions

2015-04-11 Thread Dave Angel

On 04/11/2015 06:14 AM, Dave Angel wrote:

On 04/11/2015 03:11 AM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 12:23 pm, Dave Angel wrote:


On 04/10/2015 09:42 PM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 05:31 am, sohcahto...@gmail.com wrote:


It isn't document because it is expected.  Why would the exception get
caught if you're not writing code to catch it?  If you write a
function
and pass it a tuple of exceptions to catch, I'm not sure why you would
expect it to catch an exception not in the tuple.  Just because the
tuple
is empty doesn't mean that it should catch *everything* instead.  That
would be counter-intuitive.


Really? I have to say, I expected it.




I'm astounded at your expectation.  That's like saying a for loop on an
empty list ought to loop on all possible objects in the universe.


Not really.

If we wrote:

 for x in:
 # Missing sequence leads to an infinite loop

*then* your analogy would be excellent, but it isn't. With for loops, we
iterate over each item in the sequence, hence an empty sequence means we
don't iterate at all.

But with try...except, an empty exception list means to catch
*everything*,
not nothing:


No an empty exception list means to catch nothing.  A *missing*
exception list means catch everything, but that's a different syntax


try: ...
except a,b,c: # catches a, b, c

try: ...
except a,b: # catches a, b

try: ...
except a: # catches a


try: ...
except (a,)   #catches a

try: ...
except ()  #catches nothing, as expected



try: ...
except: # catches EVERYTHING, not nothing



Different syntax.  No reason for it to pretend that it's being given an
empty tuple or list.



Putting (a, b, c) into a tuple shouldn't make a difference, and it
doesn't,
unless the tuple is empty. That surprised me.

t = a, b, c
try:
except t:  # same as except a,b,c

t = a, b
try:
except t:  # same as except a,b

t = a,
try:
except t:  # same as except a

t = ()
try:
except t:  # NOT THE SAME as bare except.


Of course not.  It's empty, so it catches nothing. Just like 'for'




I can see the logic behind the current behaviour. If you implement except
clauses like this pseudo-code:


for exc in exceptions:
 if raised_exception matches exc: catch it


then an empty tuple will naturally lead to nothing being caught. That
doesn't mean it isn't surprising from the perspective that an empty
exception list (i.e. a bare except) should be analogous to an empty
tuple.


Why should it??  It's a different syntax, with different rules.  Perhaps
it should have been consistent, but then it's this statement that's
surprising, not the behavior with an empty tuple.





The tuple lists those exceptions you're interested in, and they are
tried, presumably in order, from that collection.  If none of those
match, then the logic will advance to the next except clause.  If the
tuple is empty, then clearly none will match.


Yes, that makes sense, and I agree that it is reasonable behaviour
from one
perspective. But its also reasonable to treat "except ():" as
analogous to
a bare except.

[...]

try:
  spam()
except:
  # Implicitly an empty tuple.


No, an omitted item is not the same as an empty tuple.


You are correct about Python as it actually is, but it could have been
designed so that except (): was equivalent to a bare except.


Only by changing the bare except behavior.





If it were, then
we wouldn't have the problem of bare excepts, which are so tempting to
novices.  There's plenty of precedent in many languages for a missing
item being distinct from anything one could actually supply.


Let us put aside the fact that some people misuse bare excepts, and allow
that there are some uses for it. Now, in Python 2.6 and later, you can
catch everything by catching BaseException. But in older versions, you
could raise strings as well, and the only way to catch everything is
with a
bare except.

If you want to write a function that takes a list of things to catch,
defaulting to "everything", in Python 2.6+ we can write:

def spam(things_to_catch=BaseException):
    try:
        do_stuff()
    except things_to_catch:
        handle_exception()


but in older versions you have to write this:

def spam(things_to_catch=None):
    if things_to_catch is None:
        try:
            do_stuff()
        except:
            handle_exception()
    else:
        try:
            do_stuff()
        except things_to_catch:
            handle_exception()


This violates Don't Repeat Yourself. Any time you have "a missing item
being
distinct from anything one could actually supply", you have a poor
design.


Yep, and it happens all the time.  For example, mylist[a:b:-1].  What 
value can I use for b to mean the whole list?

There are others more grotesque, but I can't think of any at this moment.



A

Re: try..except with empty exceptions

2015-04-11 Thread Dave Angel

On 04/11/2015 03:11 AM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 12:23 pm, Dave Angel wrote:


On 04/10/2015 09:42 PM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 05:31 am, sohcahto...@gmail.com wrote:


It isn't document because it is expected.  Why would the exception get
caught if you're not writing code to catch it?  If you write a function
and pass it a tuple of exceptions to catch, I'm not sure why you would
expect it to catch an exception not in the tuple.  Just because the
tuple
is empty doesn't mean that it should catch *everything* instead.  That
would be counter-intuitive.


Really? I have to say, I expected it.




I'm astounded at your expectation.  That's like saying a for loop on an
empty list ought to loop on all possible objects in the universe.


Not really.

If we wrote:

 for x in:
 # Missing sequence leads to an infinite loop

*then* your analogy would be excellent, but it isn't. With for loops, we
iterate over each item in the sequence, hence an empty sequence means we
don't iterate at all.

But with try...except, an empty exception list means to catch *everything*,
not nothing:


No an empty exception list means to catch nothing.  A *missing* 
exception list means catch everything, but that's a different syntax


try: ...
except a,b,c: # catches a, b, c

try: ...
except a,b: # catches a, b

try: ...
except a: # catches a


try: ...
except (a,)   #catches a

try: ...
except ()  #catches nothing, as expected



try: ...
except: # catches EVERYTHING, not nothing



Different syntax.  No reason for it to pretend that it's being given an 
empty tuple or list.




Putting (a, b, c) into a tuple shouldn't make a difference, and it doesn't,
unless the tuple is empty. That surprised me.

t = a, b, c
try:
except t:  # same as except a,b,c

t = a, b
try:
except t:  # same as except a,b

t = a,
try:
except t:  # same as except a

t = ()
try:
except t:  # NOT THE SAME as bare except.


Of course not.  It's empty, so it catches nothing. Just like 'for'




I can see the logic behind the current behaviour. If you implement except
clauses like this pseudo-code:


for exc in exceptions:
 if raised_exception matches exc: catch it


then an empty tuple will naturally lead to nothing being caught. That
doesn't mean it isn't surprising from the perspective that an empty
exception list (i.e. a bare except) should be analogous to an empty tuple.


Why should it??  It's a different syntax, with different rules.  Perhaps 
it should have been consistent, but then it's this statement that's 
surprising, not the behavior with an empty tuple.






The tuple lists those exceptions you're interested in, and they are
tried, presumably in order, from that collection.  If none of those
match, then the logic will advance to the next except clause.  If the
tuple is empty, then clearly none will match.


Yes, that makes sense, and I agree that it is reasonable behaviour from one
perspective. But its also reasonable to treat "except ():" as analogous to
a bare except.

[...]

try:
  spam()
except:
  # Implicitly an empty tuple.


No, an omitted item is not the same as an empty tuple.


You are correct about Python as it actually is, but it could have been
designed so that except (): was equivalent to a bare except.


Only by changing the bare except behavior.





If it were, then
we wouldn't have the problem of bare excepts, which are so tempting to
novices.  There's plenty of precedent in many languages for a missing
item being distinct from anything one could actually supply.


Let us put aside the fact that some people misuse bare excepts, and allow
that there are some uses for it. Now, in Python 2.6 and later, you can
catch everything by catching BaseException. But in older versions, you
could raise strings as well, and the only way to catch everything is with a
bare except.

If you want to write a function that takes a list of things to catch,
defaulting to "everything", in Python 2.6+ we can write:

def spam(things_to_catch=BaseException):
    try:
        do_stuff()
    except things_to_catch:
        handle_exception()


but in older versions you have to write this:

def spam(things_to_catch=None):
    if things_to_catch is None:
        try:
            do_stuff()
        except:
            handle_exception()
    else:
        try:
            do_stuff()
        except things_to_catch:
            handle_exception()


This violates Don't Repeat Yourself. Any time you have "a missing item being
distinct from anything one could actually supply", you have a poor design.


Yep, and it happens all the time.  For example, mylist[a:b:-1].  What 
value can I use for b to mean the whole list?


There are others more grotesque, but I can't think of any at this moment.



Anyway, in modern Python (2.6 onwards), 

Re: try..except with empty exceptions

2015-04-10 Thread Dave Angel

On 04/10/2015 10:38 PM, Rustom Mody wrote:

On Saturday, April 11, 2015 at 7:53:31 AM UTC+5:30, Dave Angel wrote:

On 04/10/2015 09:42 PM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 05:31 am, sohcahtoa82 wrote:


It isn't document because it is expected.  Why would the exception get
caught if you're not writing code to catch it?  If you write a function
and pass it a tuple of exceptions to catch, I'm not sure why you would
expect it to catch an exception not in the tuple.  Just because the tuple
is empty doesn't mean that it should catch *everything* instead.  That
would be counter-intuitive.


Really? I have to say, I expected it.




I'm astounded at your expectation.  That's like saying a for loop on an
empty list ought to loop on all possible objects in the universe.


To work, this analogy should also have two python syntaxes like this:

"Normal" for-loop:
for var in iterable:
   suite

"Empty" for-loop:
for:
   suite



That tells me nothing about your opinions.  What did you mean by the 
phrase "to work"?  My analogy already works.  The for loop on an empty 
list loops zero times.  Just like try/except on an empty tuple catches 
zero exception types.


As for the separate syntax, that might be an acceptable extension to 
Python.  But it already has a convention for an infinite loop, which is

     while True:

I'm pretty sure do{} works as an infinite loop in C, but perhaps I'm 
remembering some other language where you could omit the conditional.






--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: try..except with empty exceptions

2015-04-10 Thread Dave Angel

On 04/10/2015 09:42 PM, Steven D'Aprano wrote:

On Sat, 11 Apr 2015 05:31 am, sohcahto...@gmail.com wrote:


It isn't document because it is expected.  Why would the exception get
caught if you're not writing code to catch it?  If you write a function
and pass it a tuple of exceptions to catch, I'm not sure why you would
expect it to catch an exception not in the tuple.  Just because the tuple
is empty doesn't mean that it should catch *everything* instead.  That
would be counter-intuitive.


Really? I have to say, I expected it.




I'm astounded at your expectation.  That's like saying a for loop on an 
empty list ought to loop on all possible objects in the universe.


The tuple lists those exceptions you're interested in, and they are 
tried, presumably in order, from that collection.  If none of those 
match, then the logic will advance to the next except clause.  If the 
tuple is empty, then clearly none will match.



try:
 spam()
except This, That:
 # Implicitly a tuple of two exceptions.
 pass


Compare:

try:
 spam()
except:
 # Implicitly an empty tuple.


No, an omitted item is not the same as an empty tuple.  If it were, then 
we wouldn't have the problem of bare excepts, which are so tempting to 
novices.  There's plenty of precedent in many languages for a missing 
item being distinct from anything one could actually supply.


When there's no tuple specified, it's a different syntax, and the 
semantics are specified separately.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-10 Thread Dave Angel

On 04/10/2015 09:06 PM, Dave Angel wrote:

On 04/10/2015 07:37 PM, ravas wrote:

def m_and_m(dividend):
    rlist = []
    dm = divmod
    end = (dividend // 2) + 1
    for divisor in range(1, end):
        q, r = dm(dividend, divisor)
        if r is 0:
            rlist.append((divisor, q))
    return rlist

print(m_and_m(999))
---
output: [(1, 999), (3, 333), (9, 111), (27, 37), (37, 27), (111, 9),
(333, 3)]
---

How do we describe this function?
Does it have an established name?
What would you call it?
Does 'Rosetta Code' have it or something that uses it?
Can it be written to be more efficient?
What is the most efficient way to exclude the superfluous inverse tuples?
Can it be written for decimal numbers as input and/or output?

Thank you!



I'd call those factors of the original number.  For completeness, I'd
include (999,1) at the end.

If it were my problem, I'd be looking for only prime factors.  Then if
someone wanted all the factors, they could derive them from the primes,
by multiplying all possible combinations.

The program can be sped up most obviously by stopping as soon as you get
a tuple where divisor > q.  At that point, you can just repeat all the
items, reversing divisor and q for each item.  Of course, now I notice
you want to eliminate them.  So just break out of the loop when divisor
 > q.

You can gain some more speed by calculating the square root of the
dividend, and stopping when you get there.

But the real place to get improvement is to only divide by primes,
rather than every possible integer.  And once you've done the division,
let q be the next value for dividend.  So you'll get a list like

[3, 3, 3, 37]

for the value 999

See:
http://rosettacode.org/wiki/Factors_of_an_integer#Python



And

http://rosettacode.org/wiki/Prime_decomposition#Python

There the function that you should grok is  decompose()
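A minimal sketch of the same idea, for the curious (an illustration, not 
the Rosetta Code text itself):

```python
# Trial-division prime decomposition: divide out each prime factor and let
# the quotient become the next dividend, stopping once d*d exceeds n.
def decompose(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d              # the quotient is the next dividend
        d += 1
    if n > 1:                    # whatever remains is itself prime
        factors.append(n)
    return factors

print(decompose(999))            # [3, 3, 3, 37]
```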

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-10 Thread Dave Angel

On 04/10/2015 07:37 PM, ravas wrote:

def m_and_m(dividend):
    rlist = []
    dm = divmod
    end = (dividend // 2) + 1
    for divisor in range(1, end):
        q, r = dm(dividend, divisor)
        if r is 0:
            rlist.append((divisor, q))
    return rlist

print(m_and_m(999))
---
output: [(1, 999), (3, 333), (9, 111), (27, 37), (37, 27), (111, 9), (333, 3)]
---

How do we describe this function?
Does it have an established name?
What would you call it?
Does 'Rosetta Code' have it or something that uses it?
Can it be written to be more efficient?
What is the most efficient way to exclude the superfluous inverse tuples?
Can it be written for decimal numbers as input and/or output?

Thank you!



I'd call those factors of the original number.  For completeness, I'd 
include (999,1) at the end.


If it were my problem, I'd be looking for only prime factors.  Then if 
someone wanted all the factors, they could derive them from the primes, 
by multiplying all possible combinations.


The program can be sped up most obviously by stopping as soon as you get 
a tuple where divisor > q.  At that point, you can just repeat all the 
items, reversing divisor and q for each item.  Of course, now I notice 
you want to eliminate them.  So just break out of the loop when divisor > q.


You can gain some more speed by calculating the square root of the 
dividend, and stopping when you get there.


But the real place to get improvement is to only divide by primes, 
rather than every possible integer.  And once you've done the division, 
let q be the next value for dividend.  So you'll get a list like


[3, 3, 3, 37]

for the value 999

See:
http://rosettacode.org/wiki/Factors_of_an_integer#Python
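For illustration, the square-root cutoff suggested above looks roughly 
like this (float sqrt is fine at this size; the rounding caveat raised 
elsewhere in the thread applies only to huge n):

```python
import math

# Stop trial division at sqrt(n): each hit (d, q) carries its mirror pair
# implicitly, so 999 needs about 31 divisions instead of roughly 500.
def factor_pairs(n):
    pairs = []
    for d in range(1, int(math.sqrt(n)) + 1):
        q, r = divmod(n, d)
        if r == 0:
            pairs.append((d, q))
    return pairs

print(factor_pairs(999))         # [(1, 999), (3, 333), (9, 111), (27, 37)]
```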


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: try..except with empty exceptions

2015-04-10 Thread Dave Angel

On 04/10/2015 04:48 AM, Pavel S wrote:

Hi,

I noticed interesting behaviour. Since I don't have python3 installation here, 
I tested that on Python 2.7.

A well-known feature is that a try..except block can catch multiple exceptions 
listed in a tuple:


exceptions = ( TypeError, ValueError )

try:
 a, b = None
except exceptions, e:
 print 'Catched error:', e


However when exceptions=(), then the try..except block behaves as if there 
were no try..except block.


exceptions = ()

try:
 a, b = None   # <--- the error will not be caught
except exceptions, e:
 print 'Catched error:', e


I found a use case for it, e.g. when I want to have a method with an 'exceptions' 
argument:

def catch_exceptions(exceptions=()):
    try:
        do_something()
    except exceptions:
        do_something_else()


catch_exceptions()   # catches nothing
catch_exceptions((TypeError,))   # catches TypeError


I believe that behaviour is not documented. What do you think?



It's no more surprising than a for loop over an empty tuple or empty 
list.  There's nothing to do, so you do nothing.
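A small demonstration of both behaviours (illustrative names only):

```python
# An empty tuple catches nothing -- the exception propagates -- while a
# populated tuple catches exactly what it lists.
def attempt(exceptions=()):
    try:
        {}["missing"]                 # always raises KeyError
    except exceptions:
        return "caught"

try:
    attempt(())                       # empty tuple: KeyError escapes
    result = "no exception"
except KeyError:
    result = "propagated"

assert result == "propagated"
assert attempt((KeyError,)) == "caught"
```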




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-09 Thread Dave Angel

On 04/09/2015 08:56 AM, Alain Ketterlin wrote:

Marko Rauhamaa  writes:


Alain Ketterlin :


No, it would not work for signed integers (i.e., with lo and hi of
int64_t type), because overflow is undefined behavior for signed.


All architectures I've ever had dealings with have used 2's-complement
integers. Overflow is well-defined, well-behaved and sign-independent
wrt addition, subtraction and multiplication (but not division).


You are confused: 2's complement does not necessarily mean modular
arithmetic. See, e.g.,
http://stackoverflow.com/questions/16188263/is-signed-integer-overflow-still-undefined-behavior-in-c



So the C standard can specify such things as undefined.  The 
architecture still will do something specific, right or wrong, and 
that's what Marko's claim was about.  The C compiler has separate types 
for unsigned and for signed, while the underlying architecture of every 
twos complement machine I have used did add, subtract, and multiply as 
though the numbers were unsigned (what you call modular arithmetic).
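That modular behaviour is easy to illustrate from Python, by reducing its 
unbounded ints the way a 64-bit two's-complement ALU would:

```python
# Add two integers the way a 64-bit ALU does: keep the low 64 bits, then
# reinterpret the top bit as the sign.
def add64(a, b):
    r = (a + b) & 0xFFFFFFFFFFFFFFFF
    if r >= 1 << 63:
        r -= 1 << 64
    return r

print(add64(2**63 - 1, 1))   # signed "overflow" wraps to -9223372036854775808
```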


In my microcoding days, the ALU did only unsigned arithmetic, while the 
various status bits had to be interpreted to decide whether a particular 
result was overflow or had a carry.  It was in interpreting those status 
bits that you had to use the knowledge of whether a particular value was 
signed or unsigned.


Then the microcode had to present those values up to the machine 
language level.  And at that level, we had no carry bits, or overflow, 
or anything directly related.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: python implementation of a new integer encoding algorithm.

2015-04-09 Thread Dave Angel

On 04/09/2015 05:33 AM, janhein.vanderb...@gmail.com wrote:

Op donderdag 19 februari 2015 19:25:14 UTC+1 schreef Dave Angel:

I wrote the following pair of functions:


   


Here's a couple of ranges of output, showing that the 7bit scheme does
better for values between 384 and 16379.

Thanks for this test; I obviously should have done it myself.
Please have a look at 
http://optarbvalintenc.blogspot.nl/2015/04/inputs-from-complangpython.html and 
the next two postings.



I still don't see where you have anywhere declared what your goal is. 
Like building a recursive compression scheme [1], if you don't have a 
specific goal in mind, you'll never actually be sure you've achieved it, 
even though you might be able to fool the patent examiners.


Any method of encoding will be worse for some values in order to be 
better for others.  Without specifying a distribution, you cannot tell 
whether a "typical" set of integers is better with one method than another.


For example, if you have uniform distribution of all integer values up 
to 256**n-1, you will not be able to beat a straight n-byte binary storage.


Other than that, I make no claims that any of the schemes previously 
discussed in this thread is unbeatable.


You also haven't made it clear whether you're assuming such a compressed 
bit stream is required to occupy an integral number of bytes.  For 
example, if your goal is to store a bunch of these arbitrary length 
integers in a file of minimal size, then you're talking classic 
compression techniques.  Or maybe you should be minimizing the time to 
convert such a bit stream to and from a conventional one.


I suggest you study Huffman encoding[2], and see what makes it tick.  It 
makes the assumptions that there are a finite set of symbols, and that 
there exists a probability distribution of the likelihood of each 
symbol, and that each takes an integral number of *bits*.


Then study arithmetic-encoding[3], which no longer assumes that a single 
symbol occupy a whole number of bits.  A mind-blowing concept. 
Incidentally, it introduces a "stop-symbol" which is given a very low 
probability.


See the book "Text Compression", 1990, by Bell, Cleary, and Witten.

[1] - http://gailly.net/05533051.html
[2] - http://en.wikipedia.org/wiki/Huffman_coding
[3] - http://en.wikipedia.org/wiki/Arithmetic_coding
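[Editorial aside: as a toy illustration of the Huffman coding referenced in [2], here is a minimal construction using the standard-library heapq module. The symbol frequencies are invented for the example; real use would derive them from the data's distribution, as discussed above.]

```python
import heapq

def huffman_codes(freqs):
    """Build prefix codes from {symbol: frequency} (a sketch of Huffman coding)."""
    # Each heap entry: (weight, tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {sym: ''}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, lo = heapq.heappop(heap)   # pop the two lightest subtrees...
        w2, _, hi = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in lo.items()}
        merged.update({s: '1' + c for s, c in hi.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))  # ...and merge them
        tie += 1
    return heap[0][2]

codes = huffman_codes({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
# More frequent symbols get shorter codes:
assert len(codes['a']) < len(codes['f'])
```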

If you're going to continue the discussion on python-list, you probably 
should start a new thread, state your actual goals, and



--
DaveA

--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 06:35 PM, jonas.thornv...@gmail.com wrote:

Den tisdag 7 april 2015 kl. 21:27:20 UTC+2 skrev Ben Bacarisse:

Ian Kelly  writes:


On Tue, Apr 7, 2015 at 12:55 PM, Terry Reedy  wrote:

On 4/7/2015 1:44 PM, Ian Kelly wrote:


def to_base(number, base):


... digits = []
... while number > 0:
... digits.append(number % base)
... number //= base
... return digits or [0]
...



to_base(2932903594368438384328325832983294832483258958495845849584958458435439543858588435856958650865490,
429496729)


[27626525, 286159541, 134919277, 305018215, 329341598, 48181777,
79384857, 112868646, 221068759, 70871527, 416507001, 31]
About 15 microseconds.



% and probably // call divmod internally and toss one of the results.
Slightly faster (5.7 versus 6.1 microseconds on my machine) is


Not on my box.

$ python3 -m timeit -s "n = 100; x = 42" "n % x; n // x"
1000 loops, best of 3: 0.105 usec per loop
$ python3 -m timeit -s "n = 100; x = 42" "divmod(n,x)"
1000 loops, best of 3: 0.124 usec per loop


I get similar results, but the times switch over when n is large enough
to become a bignum.

--
Ben.


I am not sure you guys realised that although the size of the factors to 
multiply expands according to base^(exp+1) for each digit place, the number of 
comparisons needed to reach the digit place (multiple of base^exp+1) is 
constant with my approach/method.



Baloney.

But even if it were true, a search is slower than a divide.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Euler module under python 2.7 and 3.4 instalation...

2015-04-07 Thread Dave Angel

On 04/07/2015 11:39 AM, blue wrote:

Dear friends .
I want to install Euler module under python 2.7 and / or 3.4 version.
I try pip and pip 3.4 but seam not working for me.
I need some help with this .
Thank you . Regards.



You don't specify what you mean by Euler.  There are at least 3 things 
you might be talking about:


http://euler.rene-grothmann.de/index.html

https://projecteuler.net/

https://github.com/iKevinY/EulerPy

Without you telling us what you're talking about, we're not likely to be 
able to help.


If you used a pip command(s), please show us exactly what you typed, and 
what result you saw.  Use copy/paste.


And tell us what OS you're doing it on.

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 11:40 AM, jonas.thornv...@gmail.com wrote:

Den tisdag 7 april 2015 kl. 16:32:56 UTC+2 skrev Ian:

On Tue, Apr 7, 2015 at 3:44 AM,   wrote:



I want todo faster baseconversion for very big bases like base 1 000 000, so 
instead of adding up digits i search it.

I need the fastest algorithm to find the relation to a decimal number.
Digmult is an instance of base at a digitplace (base^x) what i try to find is 
the digit for the below condition is true and the loop break.


*
for (digit=0;digit<=base;digit++) {
if((digit+1)*digmult>decNumber)break;
}
*


   



One could start at half base searching, but then i Think i've read that using 
1/3 closing in faster?


Do you mean binary search? That would be an improvement over the
linear search algorithm you've shown. Whether a trinary search might
be faster would depend on the distribution of the numbers you expect.
If they're evenly distributed, it will be slower.


I think I also remember that if the search space is so big that at least 22 or 23 
guesses are needed, a random oracle may be even faster?

Just pick a number and get lucky; is there any truth to that?


On average, a random Oracle with a search space of 100 will need
100 guesses.


Well of course you use same principles like a binary search setting min and max, 
closing in on the digit. In this case the searched numbers  > base^exp  and 
number< base^exp+1.

But since the search is within large bases upto 32-bit space, so base 
4294967295 is the biggest allowed. I need to find the nearest less exp in base 
for each (lets call them pseudo digits). But as you see there it will take time 
to add them up. So better doing a binary search, you know min-max half 
(iteration). You can do the same for a random oracle min max within range, and 
if the number of tries in general over 22 i think a random oracle do it better 
than a binary search.

It was a long time since i did this, but i do know there is a threshold where 
searching min max with the oracle will be faster than the binary search.



Once again, there's no point in doing a search, when a simple integer 
divide can give you the exact answer.  And there's probably no point in 
going left to right when right to left would yield a tiny, fast program.


I haven't seen one line of Python from you yet, so perhaps you're just 
yanking our chain.  I'm not here to optimize Javascript code.


Using only Python 3.4 and builtin functions, this function can be 
implemented straightforwardly in 7 lines, assuming number is nonnegative 
integer, and base is positive integer.  It definitely could be done 
smaller, but then the code might be more confusing.
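[Editorial aside: one way such a short implementation might look, using the thread's own example of 854544 in base 256. A sketch; not necessarily the version Dave had in mind.]

```python
def to_base(number, base):
    """Return the base-`base` digits of a nonnegative int, most significant first."""
    digits = []
    while number:
        number, digit = divmod(number, base)  # one divide yields digit and quotient
        digits.append(digit)
    return digits[::-1] or [0]  # reverse; empty list means the number was 0

assert to_base(854544, 256) == [13, 10, 16]
```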


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 11:05 AM, Grant Edwards wrote:

On 2015-04-07, Chris Angelico  wrote:

On Wed, Apr 8, 2015 at 12:36 AM,   wrote:


Integers are internally assumed to be base 10 otherwise you could not
calculate without giving the base.

All operations on integers addition, subtraction, multiplication and
division assume base 10.


You misunderstand how computers and programming languages work. What
you're seeing there is that *integer literals* are usually in base
10; and actually, I can point to plenty of assembly languages where
the default isn't base 10 (it's usually base 16 (hexadecimal) on IBM
PCs, and probably base 8 (octal) on big iron).


I'd be curious to see some of those assemblers. I've used dozens of
assemblers over the years for everything from microprocessors with a
few hundred bytes of memory to mini-computers and mainframes.  I've
never seen one that didn't default to base 10 for integer literals.

I'm not saying they don't exist, just that it would be interesting to
see an example of one.



I can't "show" it to you, but the assembler used to write microcode on 
the Wang labs 200VP and 2200MVP used hex for all its literals.  I wrote 
the assembler (and matching debugger-assembler), and if we had needed 
other bases I would have taken an extra day to add them in.


That assembler was not available to our customers, as the machine 
shipped with the microcode in readonly form.  Not quite as readonly as 
the Intel processors of today, of course.



Additionally, the MSDOS DEBUG program used hex to enter in its literals, 
if i recall correctly.  Certainly when it disassembled code, it was in hex.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 10:36 AM, jonas.thornv...@gmail.com wrote:



All operations on integers addition, subtraction, multiplication and division 
assume base 10.



There have been machines where that was true, but I haven't worked on 
such for about 30 years.  On any machines I've programmed lately, the 
arithmetic is done in binary by default, and only converted to decimal 
for printing.


Not that the internal base is usually relevant, of course.

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 10:10 AM, jonas.thornv...@gmail.com wrote:

Den tisdag 7 april 2015 kl. 15:30:36 UTC+2 skrev Dave Angel:


   


If that code were in Python, I could be more motivated to critique it.
The whole algorithm could be much simpler.  But perhaps there is some
limitation of javascript that's crippling the code.

How would you do it if you were converting the base by hand?  I
certainly wouldn't be doing any trial and error.  For each pass, I'd
calculate quotient and remainder, where remainder is the digit, and
quotient is the next value you work on.


--
DaveA


I am doing it just like i would do it by hand finding the biggest digit first. 
To do that i need to know nearest base^exp that is less than the actual number. 
Add up the digit (multiply) it to the nearest smaller multiple. Subtract that 
number (base^exp*multiple).

Divide / Scale down the exponent with base. And record the digit.
And start looking for next digit doing same manipulation until remainder = 0.

And that is what i am doing.



Then I don't know why you do the call to reverse() in the top-level code.

If I were doing it, I'd have no trial and error in the code at all. 
Generate the digits right to left, then reverse them before returning.


For example, if you want to convert 378 to base 10 (it's binary 
internally), you'd divide by 10 to get 37, remainder 8.  Save the 8, and 
loop again.  Divide 37 by 10 and get 3, remainder 7.  Save the 7. Divide 
again by 10 and get 0, remainder 3.  Save the 3


Now you have '8', '7', '3'   So you reverse the list, and get
 '3', '7', '8'
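[Editorial aside: the worked example above translates almost directly into Python using divmod. A sketch.]

```python
def digits_right_to_left(number, base=10):
    out = []
    while number > 0:
        number, digit = divmod(number, base)  # quotient carries on, remainder is the digit
        out.append(digit)
    out.reverse()  # digits were collected least-significant first
    return out

assert digits_right_to_left(378) == [3, 7, 8]
```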



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Best search algorithm to find condition within a range

2015-04-07 Thread Dave Angel

On 04/07/2015 05:44 AM, jonas.thornv...@gmail.com wrote:



I want todo faster baseconversion for very big bases like base 1 000 000, so 
instead of adding up digits i search it.


For this and most of the following statements:  I can almost guess what 
you're trying to say.  However, I cannot.  No idea why you're adding up 
digits, that sounds like casting out nines.  And in base-N, that would 
be casting out (N-1)'s.


What's the it you're trying to search?

How do you know the baseconversion is the bottleneck, if you haven't 
written any Python code yet?





I need the fastest algorithm to find the relation to a decimal number.


What relation would that be?  Between what and what?


Digmult is an instance of base at a digitplace (base^x) what i try to find is 
the digit for the below condition is true and the loop break.



You haven't defined a class "Base" yet.  In fact, I don't see any Python 
code in the whole message.




*
for (digit=0;digit<=base;digit++) {
if((digit+1)*digmult>decNumber)break;
}
*





So I am looking for the digit where the following condition is true:

if((digit)*digmult <= decNumber && (digit+1)*digmult > decNumber) then BREAK;


You could try integer divide.  That's just something like
 digit = decNumber // digmult
But if you think hard enough you'd realize that




One could start at half base searching, but then i Think i've read that using 
1/3 closing in faster?

I Think also i remember that if the search space so big that at least 22 or 23 
guesses, needed.A random Oracle may even faster?

Just pick up a number and get lucky, is it any truth to that?

Below the actual algorithm.




//CONVERT A DECIMAL NUMBER INTO ANYBASE
function newbase(decNumber,base){
digits=1;
digmult=1;
while(digmult*base<=decNumber){
 digmult=digmult*base
 digits++;
}
digsave=digmult;
while(decNumber>0 || digits>0){
 loop=1;
 digit=0;
for (digit=0;digit<=base;digit++) {
 if((digit+1)*digmult>decNumber)break;
}
 out[digits]=digit;
 digmult=digmult*digit;
 decNumber=decNumber-digmult;
 digsave=digsave/base;
 digmult=digsave;
 digits--;
 }
return out;
}

var out= [];
base=256;
number=854544;
out=newbase(number,base);
out.reverse();
document.write("Number = ",out,"<br>");



If that code were in Python, I could be more motivated to critique it. 
The whole algorithm could be much simpler.  But perhaps there is some 
limitation of javascript that's crippling the code.


How would you do it if you were converting the base by hand?  I 
certainly wouldn't be doing any trial and error.  For each pass, I'd 
calculate quotient and remainder, where remainder is the digit, and 
quotient is the next value you work on.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list

Re: Permission denied when opening a file that was created concurrently by os.rename (Windows)

2015-04-05 Thread Dave Angel

On 04/05/2015 01:45 PM, Alexey Izbyshev wrote:

Hello!

I've hit a strange problem that I reduced to the following test case:
* Run several python processes in parallel that spin in the following loop:
while True:
   if os.path.isfile(fname):
 with open(fname, 'rb') as f:
   f.read()
 break
* Then, run another process that creates a temporary file and then
renames it to the name than other processes are expecting
* Now, some of the reading processes occasionally fail with "Permission
denied" OSError

I was able to reproduce it on two Windows 7 64-bit machines. It seems
when the file appears on the filesystem it is still unavailable to
reading, but I have no idea how it can happen. Both source and
destination files are in the same directory, and the destination doesn't
exist before calling os.rename. Everything I could find indicates that
os.rename should be atomic under this conditions even on Windows, so
nobody should be able to observe the destination in unaccessible state.

I know that I can workaround this problem by removing useless
os.path.isfile() check and wrapping open() with try-except, but I'd like
to know the root cause of the problem. Please share you thoughts.

The test case is attached, the main file is test.bat. Python is expected
to be in PATH. Stderr of readers is redirected to *.log. You may need to
run several times to hit the issue.

Alexey Izbyshev,
research assistant,
ISP RAS



The attachment is missing;  please just include it inline, after 
reducing it to a reasonably minimal sample.


My guess is that the process that does the os.rename is not closing the 
original file before renaming it.  So even though the rename is atomic, 
the file is still locked by the first process.
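[Editorial aside: a sketch of the pattern being suggested, with the writer closing the file before the atomic rename, and the reader dropping the racy isfile() check in favour of try/except. Function names are invented for illustration.]

```python
import os
import tempfile

def publish(path, data):
    """Write data to a temp file in the same directory, close it, then rename into place."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or '.')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)      # file is fully written and CLOSED...
        os.replace(tmp, path)  # ...before the atomic rename
    except BaseException:
        os.unlink(tmp)
        raise

def read_when_ready(path):
    """Just try to open, retrying on failure, instead of checking isfile() first."""
    while True:
        try:
            with open(path, 'rb') as f:
                return f.read()
        except OSError:
            continue  # not there yet (or still locked); try again
```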




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Strategy/ Advice for How to Best Attack this Problem?

2015-04-03 Thread Dave Angel

On 04/03/2015 08:50 AM, Saran A wrote:

On Friday, April 3, 2015 at 8:05:14 AM UTC-4, Dave Angel wrote:

On 04/02/2015 07:43 PM, Saran A wrote:


I addressed most of the issues. I do admit that, as a novice, I feel beholden 
to the computer - hence the over-engineering.



Should be quite the opposite.  As a novice, you ought to be testing the 
heck out of your functions, worrying about whether they are properly 
named, properly commented, and properly tested.




>>>   os.mkdir('Success')


As you correctly stated:

"

What do you do the second time through this function, when that
directory is already existing?


  copy_and_move_file( 'Failure')


The function takes two arguments, neither of which is likely to be that
string.


  initialize_logger('rootdir/Failure')
  logging.error("Either this file is empty or there are no lines")"



How would I ensure that this s directory is made only once and every file that 
is passeed goes only to 'success' or 'failure'?



Well, you could use an if clause checking with os.exist().  If the 
directory already exists, don't call the mkdir function.  That may not 
be perfect, but it should suffice for an assignment at your level.


Alternatively, you could set a global variable equal to 'Failure' or 
whatever the full path to the directory is going to be, and do a mkdir 
at the beginning of main().   Likewise for success directory, and the 
output text file.  In that case, of course, instead of creating the 
directory, you open the file (for append, of course, so the next run of 
the program doesn't trash the file), and keep the file handle handy.
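[Editorial aside: a minimal sketch of that suggestion. The directory and file names are assumed, not taken from the assignment.]

```python
import os

SUCCESS_DIR = 'Success'   # assumed names; the assignment's may differ
FAILURE_DIR = 'Failure'

def main():
    # Create the output directories once, up front; no error if they already exist.
    for d in (SUCCESS_DIR, FAILURE_DIR):
        os.makedirs(d, exist_ok=True)
    # Open the report for append so the next run doesn't trash the file.
    with open('report.txt', 'a') as report:
        print('run started', file=report)
```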



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: New to Programming: Adding custom functions with ipynotify classes

2015-04-03 Thread Dave Angel

On 04/03/2015 07:37 AM, Steven D'Aprano wrote:

On Fri, 3 Apr 2015 12:30 pm, Saran A wrote:


#This helper function returns the length of the file
def file_len(f):
 with open(f) as f:
 for i, l in enumerate(f):
 pass
 return i + 1


Not as given it doesn't. It will raise an exception if the file cannot be
opened, and it will return 1 for any file you can read.

After you fix the incorrect indentation, the function is horribly
inefficient. Instead, use this:

import os
os.stat(filename).st_size




No, he actually wants the number of records in the file.  So he still 
needs to read the whole file.


Naturally I figured that out from the phrase in the spec:
   "indicating the total number of records processed"
not from the comment in the code.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Strategy/ Advice for How to Best Attack this Problem?

2015-04-03 Thread Dave Angel

On 04/02/2015 07:43 PM, Saran A wrote:
   


I debugged and rewrote everything. Here is the full version. Feel free to tear 
this apart. The homework assignment is not due until tomorrow, so I am 
currently also experimenting with pyinotify as well. I do have questions 
regarding how to make this function compatible with the ProcessEvent Class. I 
will create another post for this.

What would you advise in regards to renaming the inaptly named dirlist?


Asked and answered.  You called it path at one point.  dir_name would 
also be good.  The point is that if you have a good name, you're less 
likely to have unreasonable code processing that name.  For example,




# # # Without data to examine here, I can only guess based on this 
requirement's language that
# # fixed records are in the input.  If so, here's a slight revision to the 
helper functions that I wrote earlier which
# # takes the function fileinfo as a starting point and demonstrates calling a 
function from within a function.
# I tested this little sample on a small set of files created with MD5 
checksums.  I wrote the Python in such a way as it
# would work with Python 2.x or 3.x (note the __future__ at the top).

# # # There are so many wonderful ways of failure, so, from a development 
standpoint, I would probably spend a bit
# # more time trying to determine which failure(s) I would want to report to 
the user, and how (perhaps creating my own Exceptions)

# # # The only other comments I would make are about safe-file handling.

# # #   #1:  Question: After a user has created a file that has failed (in
# # #processing),can the user create a file with the same name?
# # #If so, then you will probably want to look at some sort
# # #of file-naming strategy to avoid overwriting evidence of
# # #earlier failures.

# # # File naming is a tricky thing.  I referenced the tempfile module [1] and 
the Maildir naming scheme to see two different
# # types of solutions to the problem of choosing a unique filename.

## I am assuming that all of my files are going to be specified in unicode


## Utilized Spyder's Scientific Computing IDE to debug, check for indentation 
errors and test function suite

from __future__ import print_function

import os.path
import time
import logging


def initialize_logger(output_dir):

   

I didn't ever bother to read the body of this function, since a simple

  print(mydata, file=mylog_file)

will suffice to add data to the chosen text file.  Why is it constantly 
getting more complex, to solve a problem that was simple in the beginning?





#Returns filename, rootdir and filesize

def fileinfo(f):
 filename = os.path.basename(f)
 rootdir = os.path.dirname(f)
 filesize = os.path.getsize(f)
 return filename, rootdir, filesize

#returns length of file
def file_len(f):
 with open(f) as f:
 for i, l in enumerate(f):
 pass
 return i + 1


Always returns 1 or None.  Check the indentation, and test to see what 
it does for empty file, for a file with one line, and for a file with 
more than one line.




#attempts to copy file and move file to it's directory
def copy_and_move_file(src, dest):


Which is it, are you trying to copy it, or move it?  Pick one and make a 
good function name that shows your choice.



 try:
 os.rename(src, dest)


Why are you using rename, when you're trying to move the file?  Take a 
closer look at shutil, and see if it has a function that does it safer 
than rename.  The function you need uses rename, when it'll work, and 
does it other ways when rename will not.
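[Editorial aside: shutil.move is the function being alluded to; unlike a bare os.rename it falls back to copy-then-delete when source and destination are on different filesystems. The wrapper name is invented to match the advice about picking one clear name.]

```python
import shutil

def move_file(src, dest_dir):
    """Move src into dest_dir, surviving cross-filesystem moves.

    shutil.move() uses os.rename() when it can, and falls back to
    copying then deleting when rename fails (e.g. with EXDEV).
    """
    return shutil.move(src, dest_dir)
```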



 # eg. src and dest are the same file
 except IOError as e:
 print('Error: %s' % e.strerror)


A try/except that doesn't try to correct the problem is not generally 
useful.  Figure out what could be triggering the exception, and how 
you're going to handle it.  If it cannot be handled, terminate the program.


For example, what if you don't have permissions to modify one of the 
specified directories?  You can't get any useful work done, so you 
should notify the user and exit.  An alternative is to produce 50 
thousand reports of each file you've got, telling how it succeeded or 
failed, over and over.




path = "."
dirlist = os.listdir(path)

def main(dirlist):
 before = dict([(f, 0) for f in dirlist])


Since dirlist is a path, it's a string.  So you're looping through the 
characters of the name of the path.  I still don't have a clue what the 
dict is supposed to mean here.



 while True:
 time.sleep(1) #time between update check


That loop goes forever, so the following code will never run.


 after = dict([(f, None) for f in dirlist])


Once again, you're looping through the letters of the directory name. 
Or if dirlist is really a list, and you're deciding that's what it 
should be, then of course after will be identical to before.



 added = [f for f in after if not f 

Re: Strategy/ Advice for How to Best Attack this Problem?

2015-04-02 Thread Dave Angel

On 04/02/2015 09:06 AM, Saran A wrote:



Thanks for your help on this homework assignment. I started from scratch last 
night. I have added some comments that will perhaps help clarify my intentions 
and my thought process. Thanks again.

from __future__ import print_function


I'll just randomly comment on some things I see here.  You've started 
several threads, on two different forums, so it's impractical to figure 
out what's really up.



   


#Helper Functions for the Success and Failure Folder Outcomes, respectively

 def file_len(filename):


This is an indentation error, as you forgot to start at the left margin


 with open(filename) as f:
 for i, l in enumerate(f):
 pass
 return i + 1


 def copy_and_move_File(src, dest):


ditto


 try:
 shutil.rename(src, dest)


Is there a reason you don't use the move function?  rename won't work if 
the two directories aren't on the same file system.



 # eg. src and dest are the same file
 except shutil.Error as e:
 print('Error: %s' % e)
 # eg. source or destination doesn't exist
 except IOError as e:
 print('Error: %s' % e.strerror)


# Call main(), with a loop that calls # validate_files(), with a sleep after 
each pass. Before, my present #code was assuming all filenames come directly 
from the commandline.  There was no actual searching #of a directory.

# I am assuming that this is appropriate since I moved the earlier versions of 
the files.
# I let the directory name be the argument to main, and let main do a dirlist 
each time through the loop,
# and pass the corresponding list to validate_files.


path = "/some/sample/path/"
dirlist = os.listdir(path)
before = dict([(f, None) for f in dirlist)

#Syntax Error? before = dict([(f, None) for f in dirlist)
  ^
SyntaxError: invalid syntax


Look at the line in question. There's an unmatched set of brackets.  Not 
that it matters, since you don't need these 2 lines for anything.  See 
my comments on some other forum.




def main(dirlist):


bad name for a directory path variable.


 while True:
 time.sleep(10) #time between update check


Somewhere inside this loop, you want to obtain a list of files in the 
specified directory.  And you want to do something with that list.  You 
don't have to worry about what the files were last time, because 
presumably those are gone.  Unless in an unwritten part of the spec, 
you're supposed to abort if any filename is repeated over time.




 after = dict([(f, None) for f in dirlist)
 added = [f for f in after if not f in before]
 if added:
 print('Sucessfully added new file - ready to validate')
   add return statement here to pass to validate_files
if __name__ == "__main__":
 main()


You'll need an argument to call main()




#check for record time and record length - logic to be written to either pass 
to Failure or Success folder respectively

def validate_files():


Where are all the parameters to this function?


 creation = time.ctime(os.path.getctime(added))
 lastmod = time.ctime(os.path.getmtime(added))



#Potential Additions/Substitutions  - what are the implications/consequences 
for this

def move_to_failure_folder_and_return_error_file():
 os.mkdir('Failure')
 copy_and_move_File(filename, 'Failure')
 initialize_logger('rootdir/Failure')
 logging.error("Either this file is empty or there are no lines")


def move_to_success_folder_and_read(f):
 os.mkdir('Success')
 copy_and_move_File(filename, 'Success')
 print("Success", f)
 return file_len()

#This simply checks the file information by name--> is this needed anymore?

def fileinfo(file):
 filename = os.path.basename(f)
 rootdir = os.path.dirname(f)
 filesize = os.path.getsize(f)
 return filename, rootdir, filesize

if __name__ == '__main__':
import sys
validate_files(sys.argv[1:])

# -- end of file




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Strategy/ Advice for How to Best Attack this Problem?

2015-04-01 Thread Dave Angel

On 04/01/2015 09:43 AM, Saran A wrote:

On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:

On 03/31/2015 07:00 AM, Saran A wrote:

  > @DaveA: This is a homework assignment.  Is it possible that you
could provide me with some snippets or guidance on where to place your
suggestions (for your TO DOs 2,3,4,5)?
  >



On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:




It's missing a number of your requirements.  But it's a start.

If it were my file, I'd have a TODO comment at the bottom stating known
changes that are needed.  In it, I'd mention:

1) your present code is assuming all filenames come directly from the
commandline.  No searching of a directory.

2) your present code does not move any files to success or failure
directories



In function validate_files()
Just after the line
  print('success with %s on %d reco...
you could move the file, using shutil.  Likewise after the failure print.


3) your present code doesn't calculate or write to a text file any
statistics.


You successfully print to sys.stderr.  So you could print to some other
file in the exact same way.



4) your present code runs once through the names, and terminates.  It
doesn't "monitor" anything.


Make a new function, perhaps called main(), with a loop that calls
validate_files(), with a sleep after each pass.  Of course, unless you
fix TODO#1, that'll keep looking for the same files.  No harm in that if
that's the spec, since you moved the earlier versions of the files.

But if you want to "monitor" the directory, let the directory name be
the argument to main, and let main do a dirlist each time through the
loop, and pass the corresponding list to validate_files.



5) your present code doesn't check for zero-length files



In validate_and_process_data(), instead of checking filesize against
ftell, check it against zero.


I'd also wonder why you bother checking whether the
os.path.getsize(file) function returns the same value as the os.SEEK_END
and ftell() code does.  Is it that you don't trust the library?  Or that
you have to run on Windows, where the line-ending logic can change the
apparent file size?

I notice you're not specifying a file mode on the open.  So in Python 3,
your sizes are going to be specified in unicode characters after
decoding.  Is that what the spec says?  It's probably safer to
explicitly specify the mode (and the file encoding if you're in text).

I see you call strip() before comparing the length.  Could there ever be
leading or trailing whitespace that's significant?  Is that the actual
specification of line size?

--
DaveA






I ask this because I have been searching fruitlessly through for some time and 
there are so many permutations that I am bamboozled by which is considered best 
practice.

Moreover, as to the other comments, those are too specific. The scope of the 
assignment is very limited, but I am learning what I need to look out or ask 
questions regarding specs - in the future.




--
DaveA


@DaveA

My most recent commit 
(https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py)
 has more annotations and comments for each file.


Perhaps you don't realize how github works.  The whole point is it 
preserves the history of your code, and you use the same filename for 
each revision.


Or possibly it's I that doesn't understand it.  I use git, but haven't 
actually used github for my own code.




I have attempted to address the functional requirements that you brought up:

1) Before, my present code was assuming all filenames come directly from the 
commandline.  No searching of a directory. I think that I have addressed this.



Have you even tried to run the code?  It quits immediately with an 
exception since your call to main() doesn't pass any arguments, and main 
requires one.


> def main(dirslist):
>     while True:
>         for file in dirslist:
>             return validate_files(file)
>         time.sleep(5)

In addition, you aren't actually doing anything to find what the files 
in the directory are.  I tried to refer to dirlist, as a hint.  A 
stronger hint:  look up  os.listdir()


And that list of files has to change each time through the while loop, 
that's the whole meaning of scanning.  You don't just grab the names 
once, you look to see what's there.
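
A minimal sketch of that rescanning idea (the handle_file callback, the 
passes argument, and the 5-second pause are placeholders for illustration, 
not code from the thread):

```python
import os
import time

def watch(directory, handle_file, passes=None):
    """Rescan `directory` on every pass so newly dropped files are seen."""
    count = 0
    while passes is None or count < passes:
        # os.listdir() is called fresh each time through the loop --
        # that's what makes this a monitor rather than a one-shot run.
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                handle_file(path)
        count += 1
        if passes is None or count < passes:
            time.sleep(5)
```

With passes=None it loops forever, which is what "monitoring" means here.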


The next thing is that you're using a variable called 'file', while 
that's a built-in type in Python.  So you really want to use a different 
name.


Next, you have a loop through the magical dirslist to get individual 
filenames, but then you call the validate_files() function with a single 
file, but that function is expecting a list of filenames.  One or the 
other has to change.


Next, you return from main after validating the first file.

Re: generator/coroutine terminology

2015-03-31 Thread Dave Angel

On 03/31/2015 09:18 AM, Albert van der Horst wrote:

In article <55062bda$0$12998$c3e8da3$54964...@news.astraweb.com>,
Steven D'Aprano   wrote:




The biggest difference is syntactic. Here's an iterator which returns a
never-ending sequence of squared numbers 1, 4, 9, 16, ...

class Squares:
    def __init__(self):
        self.i = 0
    def __next__(self):
        self.i += 1
        return self.i**2
    def __iter__(self):
        return self


You should give an example of usage. As a newby I'm not up to
figuring out the specification from source for
something built of the mysterious __ internal
thingies.
(I did experiment with Squares interactively. But I didn't get
further than creating a Squares object.)



He did say it was an iterator.  So for a first try, write a for loop:

class Squares:
    def __init__(self):
        self.i = 0
    def __next__(self):
        self.i += 1
        return self.i**2
    def __iter__(self):
        return self

for i in Squares():
    print(i)
    if i > 50:
        break

print("done")







Here's the same thing written as a generator:

def squares():
    i = 1
    while True:
        yield i**2
        i += 1


Four lines, versus eight. The iterator version has a lot of boilerplate
(although some of it, the two-line __iter__ method, could be eliminated if
there was a standard Iterator builtin to inherit from).
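
To see that the two spellings are interchangeable, take a finite prefix of 
each with itertools.islice (a sketch, not from the original posts):

```python
from itertools import islice

class Squares:
    def __init__(self):
        self.i = 0
    def __next__(self):
        self.i += 1
        return self.i ** 2
    def __iter__(self):
        return self

def squares():
    i = 1
    while True:
        yield i ** 2
        i += 1

# Both produce the same infinite stream; islice makes it finite.
print(list(islice(Squares(), 5)))  # [1, 4, 9, 16, 25]
print(list(islice(squares(), 5)))  # [1, 4, 9, 16, 25]
```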


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Strategy/ Advice for How to Best Attack this Problem?

2015-03-31 Thread Dave Angel

On 03/31/2015 07:00 AM, Saran A wrote:

> @DaveA: This is a homework assignment.  Is it possible that you 
could provide me with some snippets or guidance on where to place your 
suggestions (for your TO DOs 2,3,4,5)?

>



On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:




It's missing a number of your requirements.  But it's a start.

If it were my file, I'd have a TODO comment at the bottom stating known
changes that are needed.  In it, I'd mention:

1) your present code is assuming all filenames come directly from the
commandline.  No searching of a directory.

2) your present code does not move any files to success or failure
directories



In function validate_files()
Just after the line
print('success with %s on %d reco...
you could move the file, using shutil.  Likewise after the failure print.
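
A sketch of that move (the directory names and the move_to helper are 
placeholders, not code from the thread; validate_files() would call 
something like this after each print):

```python
import os
import shutil

def move_to(path, dest_dir):
    """Move a processed file into dest_dir, creating the directory if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(path))
    shutil.move(path, dest)
    return dest

# After the success print:  move_to(filename, "success")
# After the failure print:  move_to(filename, "failure")
```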


3) your present code doesn't calculate or write to a text file any
statistics.


You successfully print to sys.stderr.  So you could print to some other 
file in the exact same way.




4) your present code runs once through the names, and terminates.  It
doesn't "monitor" anything.


Make a new function, perhaps called main(), with a loop that calls 
validate_files(), with a sleep after each pass.  Of course, unless you 
fix TODO#1, that'll keep looking for the same files.  No harm in that if 
that's the spec, since you moved the earlier versions of the files.


But if you want to "monitor" the directory, let the directory name be 
the argument to main, and let main do a dirlist each time through the 
loop, and pass the corresponding list to validate_files.




5) your present code doesn't check for zero-length files



In validate_and_process_data(), instead of checking filesize against 
ftell, check it against zero.



I'd also wonder why you bother checking whether the
os.path.getsize(file) function returns the same value as the os.SEEK_END
and ftell() code does.  Is it that you don't trust the library?  Or that
you have to run on Windows, where the line-ending logic can change the
apparent file size?

I notice you're not specifying a file mode on the open.  So in Python 3,
your sizes are going to be specified in unicode characters after
decoding.  Is that what the spec says?  It's probably safer to
explicitly specify the mode (and the file encoding if you're in text).
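
The point about modes is easy to demonstrate: in text mode len() counts 
decoded characters, in binary mode it counts bytes (a sketch, assuming a 
UTF-8 file; the temp file is just for illustration):

```python
import tempfile

# Write one non-ASCII character: one character, two bytes in UTF-8.
with tempfile.NamedTemporaryFile("w", encoding="utf-8",
                                 suffix=".txt", delete=False) as f:
    f.write("é")
    name = f.name

with open(name, encoding="utf-8") as f:   # explicit text mode + encoding
    print(len(f.read()))                  # 1 character

with open(name, "rb") as f:               # binary mode
    print(len(f.read()))                  # 2 bytes
```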

I see you call strip() before comparing the length.  Could there ever be
leading or trailing whitespace that's significant?  Is that the actual
specification of line size?

--
DaveA






I ask this because I have been searching fruitlessly for some time, and 
there are so many permutations that I am bamboozled as to which is 
considered best practice.

Moreover, as to the other comments, those are too specific. The scope of the 
assignment is very limited, but I am learning what I need to look out or ask 
questions regarding specs - in the future.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Strategy/ Advice for How to Best Attack this Problem?

2015-03-30 Thread Dave Angel

On 03/30/2015 12:45 PM, Saran A wrote:

On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:

On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin  wrote:

Saran Ahluwalia  writes:

cross-platform...
* Monitors a folder for files that are dropped throughout the day


I don't see a cross-platform way to do that other than by waking up and
scanning the folder every so often (once a minute, say).  The Linux way
is with inotify and there's a Python module for it (search terms: python
inotify).  There might be comparable but non-identical interfaces for
other platforms.


All too often, "cross-platform" means probing for one option, then
another, then another, and using whichever one you can. On Windows,
there's FindFirstChangeNotification and ReadDirectoryChanges, which
Tim Golden wrote about, and which I coded up into a teleporter for
getting files out of a VM automatically:

http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
https://github.com/Rosuav/shed/blob/master/senddir.py

ChrisA


@Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding 
what I should keep in mind. I have an initial commit: 
https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py

I welcome your thoughts on this



It's missing a number of your requirements.  But it's a start.

If it were my file, I'd have a TODO comment at the bottom stating known 
changes that are needed.  In it, I'd mention:


1) your present code is assuming all filenames come directly from the 
commandline.  No searching of a directory.


2) your present code does not move any files to success or failure 
directories


3) your present code doesn't calculate or write to a text file any 
statistics.


4) your present code runs once through the names, and terminates.  It 
doesn't "monitor" anything.


5) your present code doesn't check for zero-length files

I'd also wonder why you bother checking whether the 
os.path.getsize(file) function returns the same value as the os.SEEK_END 
and ftell() code does.  Is it that you don't trust the library?  Or that 
you have to run on Windows, where the line-ending logic can change the 
apparent file size?


I notice you're not specifying a file mode on the open.  So in Python 3, 
your sizes are going to be specified in unicode characters after 
decoding.  Is that what the spec says?  It's probably safer to 
explicitly specify the mode (and the file encoding if you're in text).


I see you call strip() before comparing the length.  Could there ever be 
leading or trailing whitespace that's significant?  Is that the actual 
specification of line size?


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Sudoku solver

2015-03-30 Thread Dave Angel

On 03/30/2015 03:29 AM, Ian Kelly wrote:

On Mon, Mar 30, 2015 at 1:13 AM, Christian Gollwitzer  wrote:

Am 30.03.15 um 08:50 schrieb Ian Kelly:


On Sun, Mar 29, 2015 at 12:03 PM, Marko Rauhamaa  wrote:


Be careful with the benchmark comparisons. Ian's example can be solved
with the identical algorithm in eight different ways (four corners, left
or right). I ran the example with my recent Python solver and got these
times in the eight cases:

  884   s
    2.5 s
   13   s
  499   s
    5.9 s
  128   s
 1360   s
   36   s



That sounds to me like either a transcription error was made to the
puzzle at some point, or there's something wrong with your solver. The
whole point of that example was that it was a puzzle with the minimum
number of clues to specify a unique solution.


I think Marko meant, that if he creates symmetrically equivalent puzzles by
rotating / mirroring the grid, he gets vastly different execution times, but
ends up with the same solution.


That makes sense, but it is true for all puzzles that there are eight
possible orientations (since it's impossible for a puzzle solution to
be symmetric), and the wording made it sound like he was describing a
property specific to the puzzle that I posted.



But for some puzzles, the 8 timings may be much closer.  Or maybe even 
further apart.


Incidentally, there are many other variants of the same puzzle that 
might matter, beyond those 8.


The digits can all be crypto'ed: e.g. replace every 4 with an 8, and so on. 
Probably won't matter for any realistic algorithm.


The columns can be reordered, in at least some ways.  For example, if 
the first and second columns are swapped, it's a new puzzle, equivalent. 
 Likewise certain rows.


The relationship between row, column and box can be rearranged.  Some of 
these are already covered by the rotations proposed earlier, where for a 
90 degree rotate, row becomes column and column becomes row.  But in a 
similar way each box could become a column, and so on.


All of these rearrangements will change the order in which an algorithm 
examines things, and thus affect timings (but not the 
solution).


When I made my own solver years ago, I considered the puzzle to have 9 
columns, 9 rows, and 9 boxes.  So these 27 lists of 9 could be analyzed. 
 I just came up with a fast way to map those 243 cells back and forth 
with the original 81.  At that point, it no longer mattered which things 
were rows and which were columns or boxes.
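
That mapping can be written down compactly: 27 index lists (9 rows, 9 
columns, 9 boxes), each referring back into a flat 81-cell board (a sketch 
of the idea, not the original solver):

```python
def units():
    """Return 27 lists of 9 indices into a flat 81-cell board."""
    rows = [[r * 9 + c for c in range(9)] for r in range(9)]
    cols = [[r * 9 + c for r in range(9)] for c in range(9)]
    boxes = [[(br * 3 + r) * 9 + (bc * 3 + c)
              for r in range(3) for c in range(3)]
             for br in range(3) for bc in range(3)]
    return rows + cols + boxes

# Every cell belongs to exactly one row, one column, and one box:
UNITS = units()
assert len(UNITS) == 27
assert all(sum(i in u for u in UNITS) == 3 for i in range(81))
```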



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?

2015-03-29 Thread Dave Angel

On 03/29/2015 07:37 AM, Saran Ahluwalia wrote:

On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:

Below are the function's requirements. I am torn between using the OS module or some 
other quick and dirty module. In addition, my ideal assumption that this could be 
cross-platform. "Records" refers to contents in a file. What are some 
suggestions from the Pythonistas?

* Monitors a folder for files that are dropped throughout the day

* When a file is dropped in the folder the program should scan the file

o IF all the records in the file have the same length

o THEN the file should be moved to a "success" folder and a text file written 
indicating the total number of records processed

o IF the file is empty OR the records are not all of the same length

o THEN the file should be moved to a "failure" folder and a text file written 
indicating the cause for failure (for example: Empty file or line 100 was not the same 
length as the rest).


Below are some functions that I have been playing around with. I am not sure 
how to create a functional program from each of these constituent parts. I 
could use decorators or simply pass a function within another function.


Your problem isn't complicated enough to either need function objects or 
decorators.  You might want to write a generator function, but even that 
seems overkill for the problem as stated.  Just write the code, 
top-down, with dummy bodies containing stub code.  Then fill it in from 
bottom up, with unit tests for each completed function.


More complex problems can justify a different approach, but you don't 
need to use every trick in the arsenal.




[code]
import time
import fnmatch
import os
import shutil



If you have code fragments that aren't going to be used, don't write 
them as top-level code.  Either move them to another file, or at least 
enclose them in a function with a name like   dummy_do_not_use()


My own convention for that is to suffix the function name with a bunch 
of uppercase ZZZ's  That way the name jumps out at me so I'll recognize 
it, and I can be sure I'll never actually call it.




#If you want to write to a file, and if it doesn't exist, do this:

if not os.path.exists(filepath):
    f = open(filepath, 'w')

#If you want to read a file, and if it exists, do the following:

try:
    f = open(filepath)
except IOError:
    print 'I will be moving this to the '


#Changing a directory to "/home/newdir"
os.chdir("/home/newdir")


As Peter said, chdir can be very troublesome.  Avoid at almost all 
costs.  As you've done elsewhere, use os.path.join() to combine 
directory paths with relative filenames.




def move(src, dest):
    shutil.move(src, dest)

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)

searchdir = r'D:\Your\Directory\Root'
matches = []

def search():
    for root, dirnames, filenames in os.walk(searchdir):


Why are you using a directory tree when your "spec" said the files would 
be in a specific directory?



        ##  for filename in fnmatch.filter(filenames, '*.c'):
        for filename in filenames:
            ##  matches.append(os.path.join(root, filename))
            ##  print matches
            fileinfo(os.path.join(root, filename))


def get_files(src_dir):
    # traverse root directory, and list directories as dirs and files as files
    for root, dirs, files in os.walk(src_dir):
        path = root.split('/')
        for file in files:
            process(os.path.join(root, file))
            os.remove(os.path.join(root, file))


Probably you shouldn't have os.remove in the code till the stuff around 
it has been carefully tested.  Besides, nothing in the spec says you're 
going to remove any files.




def del_dirs(src_dir):
    for dirpath, _, _ in os.walk(src_dir, topdown=False):  # Listing the files
        if dirpath == src_dir:
            break
        try:
            os.rmdir(dirpath)
        except OSError as ex:
            print(ex)


def main():
    get_files(src_dir)
    del_dirs(src_dir)



Your description says "monitor".  That implies to me an ongoing process, 
or a loop.   You probably want something like:


def main():
    while True:
        process_files(directory_name)
        sleep(1)



if __name__ == "__main__":
 main()


[/code]




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Sudoku solver

2015-03-27 Thread Dave Angel

On 03/27/2015 09:56 AM, Marko Rauhamaa wrote:

"Frank Millman" :


So what I am talking about is called a "satisfactory" puzzle, which is
a subset of a "proper" puzzle.


That is impossible to define, though, because some people are mental
acrobats and can do a lot of deep analysis in their heads. What's
satisfactory to you may not be satisfactory to me.

Besides, looking for "satisfactory" patterns can involve a truckload of
trial and error.



I know, let's use "regular expressions"  


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Sudoku solver

2015-03-27 Thread Dave Angel

On 03/27/2015 09:35 AM, Frank Millman wrote:


"Dave Angel"  wrote in message
news:551557b3.5090...@davea.name...


But now I have to disagree about "true Sudoku puzzle."  As we said
earlier, it might make sense to say that puzzles that cannot be solved
that way are not reasonable ones to put in a human Sudoku book.  But why
isn't it a "true Sudoku puzzle"?



It seems you are correct.

According to Wikipedia http://en.wikipedia.org/wiki/Glossary_of_Sudoku -

A puzzle is a partially completed grid. The initially defined values are
known as givens or clues. A proper puzzle has a single (unique) solution. A
proper puzzle that can be solved without trial and error (guessing) is known
as a satisfactory puzzle. An irreducible puzzle (a.k.a. minimum puzzle) is a
proper puzzle from which no givens can be removed leaving it a proper puzzle
(with a single solution). It is possible to construct minimum puzzles with
different numbers of givens. The minimum number of givens refers to the
minimum over all proper puzzles and identifies a subset of minimum puzzles.

So what I am talking about is called a "satisfactory" puzzle, which is a
subset of a "proper" puzzle.



Thanks for the wikipedia reference.  Now we're in violent agreement, and 
even have a vocabulary to use for that agreement.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Sudoku solver

2015-03-27 Thread Dave Angel

On 03/27/2015 09:25 AM, Chris Angelico wrote:

On Sat, Mar 28, 2015 at 12:14 AM, Dave Angel  wrote:

But now I have to disagree about "true Sudoku puzzle."  As we said earlier,
it might make sense to say that puzzles that cannot be solved that way are
not reasonable ones to put in a human Sudoku book.  But why isn't it a "true
Sudoku puzzle"?

Isn't the fact that one resorts to trial and error simply a consequence of
the fact that he/she has run out of ideas for more direct rules and the data
structures to support them?

The simpler rules can be built around a list of possible values for each
cell.  More complex rules can have a more complex data structure for each
cell/row/column/box.  And when you run out of ideas for all those, you use
guess and backtrack, where the entire board's state is your data structure.


At that point, it may make a fine mathematical curiosity, but it's not
really a fun puzzle any more.


That's why I addressed my comments at Frank.  You and I are already in 
rough agreement about what makes a human game worthwhile: it has to be 
easy enough to be solvable, and hard enough to be challenging.  Those 
cutoffs differ from one person to another, and from one age group to 
another.  At one time (50+ years ago) I though Tic-Tac-Toe was tricky 
enough to be fun, but now it's always a draw, and only playable against 
a kid.  On the other hand, I play some "games" which I can only solve 
with the aid of a computer.  Is that "cheating"?  Not for some games.  I 
have some challenges for which I need/prefer to use a wrench, or a 
screwdriver, or a lawnmower.  That doesn't make them less fun, just 
different fun.


But I took Frank's comments as defining the "fine mathematical 
curiosity," and I have more interest in those than I generally do in 
"games".


Many games that I hear people talking about, I've never even tried.
I have a "TV set" which has never been hooked up to an antenna or cable. 
 Only to CD/DVD/BluRay/computer/tablet/cellphone.  So I'm a bit 
strange.  I still enjoy riding a motorcycle,  walking on the beach, or 
seeing a sunset from the backyard.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Sudoku solver

2015-03-27 Thread Dave Angel

On 03/27/2015 05:25 AM, Chris Angelico wrote:

On Fri, Mar 27, 2015 at 8:07 PM, Frank Millman  wrote:

There seems to be disagreement over the use of the term 'trial and error'.
How about this for a revised wording -

"It should be possible to reach that solution by a sequence of logical
deductions. Each step in the sequence must uniquely identify the contents of
at least one cell based on the information available. Each time a cell is
identified, that adds to the information available which can then be used to
identify the contents of further cells. This process continues until the
contents of all cells have been identified."

Any puzzle that cannot be solved by this method does not qualify as a true
Sudoku puzzle.


That's reasonable wording. Another way to differentiate between the
"trial and error" that we're objecting to and the "logical deduction"
that we're liking: Avoid backtracking. That is, you never guess a
number and see if the puzzle's solvable, and backtrack if it isn't; at
every step, the deductions you make are absolute certainties.

They might, in some cases, not result in actual result numbers (you
might deduce that "either this cell or that cell is a 2"), but it's a
certainty, based solely on the clue numbers given.



I like that wording.  It fits what I meant by trial and error.

Frank:

But now I have to disagree about "true Sudoku puzzle."  As we said 
earlier, it might make sense to say that puzzles that cannot be solved 
that way are not reasonable ones to put in a human Sudoku book.  But why 
isn't it a "true Sudoku puzzle"?


Isn't the fact that one resorts to trial and error simply a consequence 
of the fact that he/she has run out of ideas for more direct rules and 
the data structures to support them?


The simpler rules can be built around a list of possible values for each 
cell.  More complex rules can have a more complex data structure for 
each cell/row/column/box.  And when you run out of ideas for all those, 
you use guess and backtrack, where the entire board's state is your data 
structure.

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list

