Re: Pickle __getstate__ __setstate__ and restoring n/w - beazley pg 172

2016-03-10 Thread dieter
"Veek. M"  writes:
> ...
> what i wanted to know was, x = Client('192.168.0.1') will create an 
> object 'x' with the IP inside it. When I do:
> pickle.dump(x)
> pickle doesn't know where in the object the IP is, so he'll call 
> __getstate__ and expect the return value to be the IP, right?

It does not expect anything. It pickles whatever "__getstate__"
returns. Unpickling will fail (either explicitly or implicitly)
when "__setstate__" does not "fit" with "__getstate__" or does not
restore the state in the way you expect.

> Similarly, whilst unpickling, __setstate__ will be called in a manner 
> similar to this:
> x.__setstate__(self, unpickledIP)

Yes.

> __setstate__ can then populate 'x' by doing 
> self.x = str(unpickledIP)
>
> the type info is not lost during pickling is it, therefore 'str' is not 
> required is it? 

Right: the type information is preserved, so "str" is not required.

Note, however, that "__setstate__" is called automatically
during unpickling. There is no need to call it yourself.


You can also easily try out things yourself (e.g. in
an interactive Python session). The builtin "vars" allows you
to look into an object. As an example, you could do:

   import pickle

   client = Client(...)
   vars(client)
   with open(fn, "wb") as f:
       pickle.dump(client, f)

   with open(fn, "rb") as f:
       recreated_client = pickle.load(f)
   vars(recreated_client)

and then compare the two "vars" results.
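
For illustration, here is a minimal sketch of the round trip discussed
above. The "Client" class is a hypothetical stand-in (not the code from
the book): it keeps only the address in its pickled state and uses a
threading.Lock in place of a real socket, simply because a lock, like a
socket, cannot be pickled.

```python
import pickle
import threading


class Client:
    """Hypothetical stand-in for a network client (an assumption, not the book's code)."""

    def __init__(self, addr):
        self.addr = addr
        self._connect()

    def _connect(self):
        # Stand-in for opening a socket: a lock is used only because,
        # like a socket, it cannot be pickled.
        self.conn = threading.Lock()

    def __getstate__(self):
        # pickle stores whatever is returned here; it has no expectations.
        return self.addr

    def __setstate__(self, state):
        # Called automatically by pickle.load()/pickle.loads().
        self.addr = state
        self._connect()


original = Client("192.168.0.1")
restored = pickle.loads(pickle.dumps(original))

print(vars(original))
print(vars(restored))
print(type(restored.addr))  # the str type survives the round trip
```

One caveat from the pickle documentation: if "__getstate__" returns a
false value (an empty string, for example), "__setstate__" is not called
on unpickling.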

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Installing ibm_db package on Windows 7, 64-bit problem

2016-03-10 Thread Chris Angelico
On Thu, Mar 10, 2016 at 6:46 PM, Alexander Shmugliakov
 wrote:
> Hello, has anybody successfully installed ibm_db package in the 32-bit Python 
> 3.5.0 environment? This is the error messages I'm receiving:
> I know that the message about vcvarsall.bat is related to the Visual Studio 
> configuration -- I don't have it on my computer. I have no 64-bit IBM Data 
> Server Driver installed on my machine. Any help will be *greatly* appreciated.
>
> '--'
> C:\Program Files (x86)\Python 3.5>easy_install ibm_db
> Searching for ibm-db
> Reading https://pypi.python.org/simple/ibm_db/
> Best match: ibm-db 2.0.6
> Downloading 
> https://pypi.python.org/packages/source/i/ibm_db/ibm_db-2.0.6.tar.gz#md5=f6b80e1489167a141ebf8978c37ca398
> Processing ibm_db-2.0.6.tar.gz
> Writing 
> C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-2.0.6\setup.cfg
> Running ibm_db-2.0.6\setup.py -q bdist_egg --dist-dir 
> C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-
> 2.0.6\egg-dist-tmp-np3lk_6i
> Detected 32-bit Python
> C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-2.0.6\setup.py:204:
>  UserWarning: Detected usage of IBM
>  Data Server Driver package. Ensure you have downloaded 32-bit package of 
> IBM_Data_Server_Driver and retry the ibm_db module install
>
>   warnings.warn(notifyString)
> warning: no files found matching 'README'
> warning: no files found matching '*' under directory 'clidriver'
> warning: no files found matching '*' under directory 'ibm_db_dlls'
> error: Setup script exited with error: Unable to find vcvarsall.bat

Hmm. I don't understand the warning there, but the final error is
pretty straight-forward: you cannot build a package from source
without having Visual Studio installed. Sadly, my usual go-to source
for Windows wheels doesn't have ibm_db:

http://www.lfd.uci.edu/~gohlke/pythonlibs/

but that's definitely the first place to look.

You basically have two choices:

1) Install the zero-dollar edition of Visual Studio 2015 (I think it's
available from https://www.visualstudio.com/ but I've never actually
used it, so don't depend on my advice), and build the extension from
source. You may find Steve Dower's blog helpful here:
http://stevedower.id.au/blog/building-for-python-3-5/

2) Find someone who has done the above, or is willing to do it for
you. Get him/her to build you a wheel file (.whl), which will then be
able to be installed on your computer without needing Visual Studio.

You may be able to contact the author(s) of the ibm_db package and ask
for a wheel. Or there might be someone here who'll do it for you. Be
aware, though, that you have to trust the wheel builder completely;
you'll be taking compiled binary code and installing it on your
computer. I'm sure most people in the world are trustworthy, but there
are those few who aren't... But you already know that, I have no
doubt.

Building it yourself is a lot more work, especially as a one-off, but
once you get yourself set up to build from source, you should be able
to build anything else you need without too much more effort.

All the best!

ChrisA


Pure Python routine for calculation of CRC of arbitrary length

2016-03-10 Thread wzab01
Hi,

In my FPGA-related work, I had to generate and verify various non-standard 
CRC sums in a Python-based testbench.
I was able to find a few CRC implementations for Python (e.g. crcmod or PyCRC), 
but none of them could work with an arbitrary CRC polynomial and data of 
arbitrary length. Therefore I decided to "reinvent the wheel" and implement 
something in pure Python from scratch.
The implementation is aimed at simplicity rather than performance.
Therefore, it computes the CRC in a loop rather than using the table-based approach.
The data and the CRC values are represented as bitarrays (using the 
bitarray package).

The usage is very simple:

import bitarray
from CRC_ba import CRC

d1=bitarray.bitarray('11010011101110')
d2=bitarray.bitarray('110100101110')

# Calculate the CRC
f = CRC(0b1011, 3)
f.update(d1)
f.update(d2)
cr = f.checkCRC()  # Without an argument, it calculates the CRC
print(cr)

# Check the CRC
f = CRC(0b1011, 3)
f.update(d1)
f.update(d2)
chk = f.checkCRC(cr)
print(chk)  # We should get only zeroes here

The CRC object constructor accepts the CRC polynomial (as an integer value), 
the CRC length, and (optionally) the information whether data are transmitted 
MSB first (the default or 0 value), or LSB first (the 1 value).
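
The loop-based (non-table) approach mentioned above can be sketched
without the bitarray package. This is NOT the code from CRC_ba, only a
minimal illustration of bit-by-bit polynomial division, MSB first. It
assumes that the polynomial integer includes its leading term (so 0b1011
with length 3 means x^3 + x + 1); the function names and the use of
'0'/'1' strings are choices made for this example.

```python
def remainder(bits, poly, width):
    """Divide a string of '0'/'1' characters by poly, MSB first.

    poly is assumed to include its leading term, i.e. it has width + 1 bits.
    """
    reg = 0
    for ch in bits:
        reg = (reg << 1) | int(ch)
        # When a bit reaches position `width`, subtract (XOR) the polynomial.
        if (reg >> width) & 1:
            reg ^= poly
    return reg


def crc(bits, poly, width):
    """CRC of the data: the remainder after appending `width` zero bits."""
    return remainder(bits + "0" * width, poly, width)


data = "11010011101100"
value = crc(data, 0b1011, 3)

# Verification: data followed by its CRC must divide with remainder zero.
check = remainder(data + format(value, "03b"), 0b1011, 3)

print(format(value, "03b"), check)  # prints: 100 0
```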

The code is not of the highest quality, but it saved me a lot of work, so I 
published it as PUBLIC DOMAIN in hope that it may be useful for others.
It is available in alt.sources group as "Python routine for CRC of arbitrary 
length - bitarray based version" or directly at 
https://groups.google.com/d/msg/alt.sources/dBNqgU1rFYc/A32HmbL9GgAJ

With best regards,
Wojtek


Re: pip install failure for cryptography, gnureadline

2016-03-10 Thread dieter
Pietro Paolini  writes:
> ...
> I am not really familiar with the Py subsystem, even though I have got some
> guidance from some colleague, I am getting stuck when installing a list of
> packages contained in a file, running such command :
>
>
> pip install -r /home/pietro/projects/cloud-provisioning/requirements.txt
>
> Brings me :
>
>
> Collecting docutils>=0.10 (from botocore<1.4.0,>=1.3.0->boto3==1.2.3->-r
> /home/pietro/projects/cloud-provisioning/requirements.txt (line 9))
> Building wheels for collected packages: cryptography, gnureadline
>   Running setup.py bdist_wheel for cryptography: started
>   Running setup.py bdist_wheel for cryptography: finished with status
> 'error'
>   Complete output from command
> 
> src/cryptography/hazmat/bindings/__pycache__/_Cryptography_cffi_d5a71fe5xf53f5318.c:1944:15:
> error: ‘SSLv3_method’ redeclared as different kind of symbol
>SSL_METHOD* (*SSLv3_method)(void) = NULL;
>  ^
>   In file included from
> src/cryptography/hazmat/bindings/__pycache__/_Cryptography_cffi_d5a71fe5xf53f5318.c:294:0:
>   /usr/include/openssl/ssl.h:1892:19: note: previous declaration of
> ‘SSLv3_method’ was here
>const SSL_METHOD *SSLv3_method(void);  /* SSLv3 */
>  ^

This indicates that the "cryptography" version does not fit
with your "openssl" version: "openssl" defines "SSLv3_method"
as a function returning a constant pointer to an "SSL_METHOD",
while "cryptography" defines it as a pointer to such a function.

I would contact the "cryptography" author.



Re: Improving performance in matrix operations

2016-03-10 Thread Steven D'Aprano
On Thursday 10 March 2016 07:09, Drimades wrote:

> I'm doing some tests with operations on numpy matrices in Python. As an
> example, it takes about 3000 seconds to compute eigenvalues and
> eigenvectors using scipy.linalg.eig(a) for a matrix 6000x6000. Is it an
> acceptable time? 

I don't know what counts as acceptable. Do you have a thousand of these 
systems to solve by next Tuesday? Or one a month? Can you adjust your 
workflow to start the calculation and then go off to lunch, or do you 
require interactive use?


> Any suggestions to improve? 

Use smaller matrices? :-) Use a faster computer?


This may give you some ideas:

https://www.ibm.com/developerworks/community/blogs/jfp/entry/A_Comparison_Of_C_Julia_Python_Numba_Cython_Scipy_and_BLAS_on_LU_Factorization?lang=en



> Does C++ perform better with matrices? 

Specifically on your computer? I don't know, you'll have to try it. The 
actual time taken by a program will depend on the hardware you run it on, 
not just the language it is written in.


> Another thing to consider is that matrices I'm processing are
> heavily sparse. Do they implement any parallelism? While my code is
> running, one of my cores is 100% busy, the other one 30% busy.

You might get better answers for technical questions like that from 
dedicated numpy and scipy mailing lists.



-- 
Steve



Re: Installing ibm_db package on Windows 7, 64-bit problem

2016-03-10 Thread Mark Lawrence

On 10/03/2016 08:09, Chris Angelico wrote:
> On Thu, Mar 10, 2016 at 6:46 PM, Alexander Shmugliakov wrote:
>
> Hmm. I don't understand the warning there, but the final error is
> pretty straight-forward: you cannot build a package from source
> without having Visual Studio installed. Sadly, my usual go-to source
> for Windows wheels doesn't have ibm_db:
>
> http://www.lfd.uci.edu/~gohlke/pythonlibs/
>
> but that's definitely the first place to look.
>
> You basically have two choices:
>
> 1) Install the zero-dollar edition of Visual Studio 2015 (I think it's
> available from https://www.visualstudio.com/ but I've never actually
> used it, so don't depend on my advice), and build the extension from
> source. You may find Steve Dower's blog helpful here:
> http://stevedower.id.au/blog/building-for-python-3-5/



I've used it repeatedly with no problems, other than the fact that it 
takes hours to download and install, even with a high speed broadband 
link.  Steve Dower is attempting to get his Microsoft colleagues to 
produce a bare minimum that could be used to build Python.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Read and count

2016-03-10 Thread Val Krem via Python-list
Hi all,

I am new to Python (moving from R) and am trying to read a data set and 
count the number of observations by year for each city.


The data set looks like this:

city  year  x

XC1   2001  10
XC1   2001  20
XC1   2002  20
XC1   2002  10
XC1   2002  10

Yv2   2001  10
Yv2   2002  20
Yv2   2002  20
Yv2   2002  10
Yv2   2002  10

The output should be:

city
xc1   2001  2
xc1   2002  3
yv2   2001  1
yv2   2002  4


Below is my starting code:
count=0
fo=open("dat", "r+")
str = fo.read();
print "Read String is : ", str

fo.close()


Many thanks


Simple exercise

2016-03-10 Thread Rodrick Brown
From the following input

9
BANANA FRIES 12
POTATO CHIPS 30
APPLE JUICE 10
CANDY 5
APPLE JUICE 10
CANDY 5
CANDY 5
CANDY 5
POTATO CHIPS 30

I'm expecting the following output
BANANA FRIES 12
POTATO CHIPS 60
APPLE JUICE 20
CANDY 20

However, my code seems to be returning an incorrect value

#!/usr/bin/env python3

import sys
import re
from collections import OrderedDict

if __name__ == '__main__':

  od = OrderedDict()
  recs = int(input())

  for _ in range(recs):
    file_input = sys.stdin.readline().strip()
    m = re.search(r"(\w.+)\s+(\d+)", file_input)

    if m:
      if m.group(1) not in od.keys():
        od[m.group(1)] = int(m.group(2))
      else:
        od[m.group(1)] += int(od.get(m.group(1),0))
  for k,v in od.items():
    print(k,v)

What's really going on here?

$ cat groceries.txt | ./groceries.py
BANANA FRIES 12
POTATO CHIPS 60
APPLE JUICE 20
CANDY 40


Re: Read and count

2016-03-10 Thread Jussi Piitulainen
Val Krem writes:

> Hi all,
>
> I am a new learner about python (moving from R to python) and trying
> read and count the number of observation by year for each city.
>
>
> The data set look like
> city year  x 
>
> XC1 2001  10
> XC1   2001  20
> XC1   2002   20
> XC1   2002   10
> XC1 2002   10
>
> Yv2 2001   10
> Yv2 2002   20
> Yv2 2002   20
> Yv2 2002   10
> Yv2 2002   10
>
> out put will be
>
> city
> xc1  2001  2
> xc1   2002  3
> yv1  2001  1
> yv2  2002  3
>
>
> Below is my starting code
> count=0
> fo=open("dat", "r+")
> str = fo.read();
> print "Read String is : ", str
>
> fo.close()

Below are some of the basics that you want to study. Also look up the csv
module in Python's standard library. You will want to learn these things
even if you end up using some sort of third-party data-frame library (I
don't know those but they exist).

from collections import Counter

# collections.Counter is a special dictionary type for just this
counts = Counter()

# with statement ensures closing the file
with open("dat") as fo:
    # file object provides lines
    next(fo) # skip header line
    for line in fo:
        # test requires non-empty string, but lines
        # contain at least newline character so ok
        if line.isspace(): continue
        # .split() at whitespace, omits empty fields
        city, year, x = line.split()
        # collections.Counter has default 0,
        # key is a tuple (city, year), parentheses omitted here
        counts[city, year] += 1

print("city")
for city, year in sorted(counts): # iterate over keys
    print(city.lower(), year, counts[city, year], sep = "\t")

# Alternatively:
# for cy, n in sorted(counts.items()):
#     city, year = cy
#     print(city.lower(), year, n, sep = "\t")


Re: Read and count

2016-03-10 Thread Mark Lawrence

Hello and welcome.

Please see my comments below.

On 09/03/2016 21:30, Val Krem via Python-list wrote:
> Hi all,
>
> I am a new learner about python (moving from R to python) and trying  read and
> count the number of observation  by year for each city.
>
> The data set look like
> city year  x
>
> XC1 2001  10
> XC1   2001  20
> XC1   2002   20
> XC1   2002   10
> XC1 2002   10
>
> Yv2 2001   10
> Yv2 2002   20
> Yv2 2002   20
> Yv2 2002   10
> Yv2 2002   10
>
> out put will be
>
> city
> xc1  2001  2
> xc1   2002  3
> yv1  2001  1
> yv2  2002  3
>
>
> Below is my starting code
> count=0

Seems like you'd want a counter here 
https://docs.python.org/3/library/collections.html#collections.Counter. 
You'll need to know how to import this so start here 
https://docs.python.org/3/tutorial/modules.html

> fo=open("dat", "r+")

We'd normally use the 'with' keyword here so the file automatically gets 
closed so:-

with open("dat", "r+") as fo:

> str = fo.read();

'str' isn't a good name as it overrides the builtin function of that 
name.  This will read the entire file.  Easiest to loop as in:-

for line in fo.readlines():

Now you'll need a split call to get at your data 
https://docs.python.org/3/library/stdtypes.html#str.split and update 
your counter.  Once this loop is finished use another loop to produce 
your output with print.

> print "Read String is : ", str

The above is for Python 2, it needs parenthesis for Python 3.  I'd 
recommend starting with the latter if that's possible.

> fo.close()

Not needed if you use the 'with' keyword as discussed above.

> Many thanks



No problem as I'm leaving you to put it all together :)

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Read and count

2016-03-10 Thread Peter Otten
Jussi Piitulainen wrote:

> Val Krem writes:
> 
>> Hi all,
>>
>> I am a new learner about python (moving from R to python) and trying
>> read and count the number of observation by year for each city.
>>
>>
>> The data set look like
>> city year  x
>>
>> XC1 2001  10
>> XC1   2001  20
>> XC1   2002   20
>> XC1   2002   10
>> XC1 2002   10
>>
>> Yv2 2001   10
>> Yv2 2002   20
>> Yv2 2002   20
>> Yv2 2002   10
>> Yv2 2002   10
>>
>> out put will be
>>
>> city
>> xc1  2001  2
>> xc1   2002  3
>> yv1  2001  1
>> yv2  2002  3
>>
>>
>> Below is my starting code
>> count=0
>> fo=open("dat", "r+")
>> str = fo.read();
>> print "Read String is : ", str
>>
>> fo.close()
> 
> Below's some of the basics that you want to study. Also look up the csv
> module in Python's standard library. You will want to learn these things
> even if you end up using some sort of third-party data-frame library (I
> don't know those but they exist).

With pandas:
 
$ cat sample.txt
city year  x 
XC1 2001  10
XC1   2001  20
XC1   2002   20
XC1   2002   10
XC1 2002   10
Yv2 2001   10
Yv2 2002   20
Yv2 2002   20
Yv2 2002   10
Yv2 2002   10
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> table = pandas.read_csv("sample.txt", delimiter=r"\s+")
>>> table
  city  year   x
0  XC1  2001  10
1  XC1  2001  20
2  XC1  2002  20
3  XC1  2002  10
4  XC1  2002  10
5  Yv2  2001  10
6  Yv2  2002  20
7  Yv2  2002  20
8  Yv2  2002  10
9  Yv2  2002  10

[10 rows x 3 columns]
>>> table.groupby(["city", "year"])["x"].count()
city  year
XC1   2001    2
      2002    3
Yv2   2001    1
      2002    4
dtype: int64


> from collections import Counter
> 
> # collections.Counter is a special dictionary type for just this
> counts = Counter()
> 
> # with statement ensures closing the file
> with open("dat") as fo:
>     # file object provides lines
>     next(fo) # skip header line
>     for line in fo:
>         # test requires non-empty string, but lines
>         # contain at least newline character so ok
>         if line.isspace(): continue
>         # .split() at whitespace, omits empty fields
>         city, year, x = line.split()
>         # collections.Counter has default 0,
>         # key is a tuple (city, year), parentheses omitted here
>         counts[city, year] += 1
> 
> print("city")
> for city, year in sorted(counts): # iterate over keys
>     print(city.lower(), year, counts[city, year], sep = "\t")
> 
> # Alternatively:
> # for cy, n in sorted(counts.items()):
> #     city, year = cy
> #     print(city.lower(), year, n, sep = "\t")




RE: Read and count

2016-03-10 Thread Joaquin Alzola
Try a .split() on each line and then add fields 0 and 1 to a list.

Opening the file can be done more easily:

with open(file) as f:
    for line in f:
        print(line)

-Original Message-
From: Python-list 
[mailto:python-list-bounces+joaquin.alzola=lebara@python.org] On Behalf Of 
Val Krem via Python-list
Sent: 09 March 2016 21:31
To: python-list@python.org
Subject: Read and count

Hi all,

I am a new learner about python (moving from R to python) and trying  read and 
count the number of observation  by year for each city.


The data set look like
city year  x

XC1 2001  10
XC1   2001  20
XC1   2002   20
XC1   2002   10
XC1 2002   10

Yv2 2001   10
Yv2 2002   20
Yv2 2002   20
Yv2 2002   10
Yv2 2002   10

out put will be

city
xc1  2001  2
xc1   2002  3
yv1  2001  1
yv2  2002  3


Below is my starting code
count=0
fo=open("dat", "r+")
str = fo.read();
print "Read String is : ", str

fo.close()


Many thanks


Re: Installing ibm_db package on Windows 7, 64-bit problem

2016-03-10 Thread Alexander Shmugliakov
On Thursday, March 10, 2016 at 10:09:26 AM UTC+2, Chris Angelico wrote:
> On Thu, Mar 10, 2016 at 6:46 PM, Alexander Shmugliakov
>  wrote:
> > Hello, has anybody successfully installed ibm_db package in the 32-bit 
> > Python 3.5.0 environment? This is the error messages I'm receiving:
> > I know that the message about vcvarsall.bat is related to the Visual Studio 
> > configuration -- I don't have it on my computer. I have no 64-bit IBM Data 
> > Server Driver installed on my machine. Any help will be *greatly* 
> > appreciated.
> >
> > '--'
> > C:\Program Files (x86)\Python 3.5>easy_install ibm_db
> > Searching for ibm-db
> > Reading https://pypi.python.org/simple/ibm_db/
> > Best match: ibm-db 2.0.6
> > Downloading 
> > https://pypi.python.org/packages/source/i/ibm_db/ibm_db-2.0.6.tar.gz#md5=f6b80e1489167a141ebf8978c37ca398
> > Processing ibm_db-2.0.6.tar.gz
> > Writing 
> > C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-2.0.6\setup.cfg
> > Running ibm_db-2.0.6\setup.py -q bdist_egg --dist-dir 
> > C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-
> > 2.0.6\egg-dist-tmp-np3lk_6i
> > Detected 32-bit Python
> > C:\Users\shmuglak\AppData\Local\Temp\easy_install-yijpttuo\ibm_db-2.0.6\setup.py:204:
> >  UserWarning: Detected usage of IBM
> >  Data Server Driver package. Ensure you have downloaded 32-bit package of 
> > IBM_Data_Server_Driver and retry the ibm_db module install
> >
> >   warnings.warn(notifyString)
> > warning: no files found matching 'README'
> > warning: no files found matching '*' under directory 'clidriver'
> > warning: no files found matching '*' under directory 'ibm_db_dlls'
> > error: Setup script exited with error: Unable to find vcvarsall.bat
> 
> Hmm. I don't understand the warning there, but the final error is
> pretty straight-forward: you cannot build a package from source
> without having Visual Studio installed. Sadly, my usual go-to source
> for Windows wheels doesn't have ibm_db:
> 
> http://www.lfd.uci.edu/~gohlke/pythonlibs/
> 
> but that's definitely the first place to look.
> 
> You basically have two choices:
> 
> 1) Install the zero-dollar edition of Visual Studio 2015 (I think it's
> available from https://www.visualstudio.com/ but I've never actually
> used it, so don't depend on my advice), and build the extension from
> source. You may find Steve Dower's blog helpful here:
> http://stevedower.id.au/blog/building-for-python-3-5/
> 
> 2) Find someone who has done the above, or is willing to do it for
> you. Get him/her to build you a wheel file (.whl), which will then be
> able to be installed on your computer without needing Visual Studio.
> 
> You may be able to contact the author(s) of the ibm_db package and ask
> for a wheel. Or there might be someone here who'll do it for you. Be
> aware, though, that you have to trust the wheel builder completely;
> you'll be taking compiled binary code and installing it on your
> computer. I'm sure most people in the world are trustworthy, but there
> are those few who aren't... But you already know that, I have no
> doubt.
> 
> Building it yourself is a lot more work, especially as a one-off, but
> once you get yourself set up to build from source, you should be able
> to build anything else you need without too much more effort.
> 
> All the best!
> 
> ChrisA

Thank you Chris! I actually received your response while in the middle of 
installing VS Community Edition (quite a lengthy process, for some reason). 
I will see whether it solves my problem (or at least brings me the missing 
dll). I appreciate your quick response in any case.

Best, 
Alexander


Re: Simple exercise

2016-03-10 Thread Erik

Hi.

This looks a little like a homework problem, so I'll be cryptic ... :)

On 10/03/16 09:02, Rodrick Brown wrote:

> From the following input
>
> 9
> BANANA FRIES 12
> POTATO CHIPS 30
> APPLE JUICE 10
> CANDY 5
> APPLE JUICE 10
> CANDY 5
> CANDY 5
> CANDY 5
> POTATO CHIPS 30

>     if m:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))


Look very carefully at the last line above. What is it _actually_ doing?

Also, consider that for each of the input lines all of the items that 
occur more than once exactly double the current value each time they are 
repeated. Except "CANDY", which is the item you are having a problem 
with ... that may tell you something.
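
A small trace may make the doubling visible. This is only a sketch: the
dictionary "totals" and the list "history" are names invented for the
illustration, and the update line mirrors the last line of the quoted
code for an item that is repeated three more times.

```python
# Start from the state after the first "CANDY 5" line has been read.
totals = {"CANDY": 5}
history = [totals["CANDY"]]

# Three more "CANDY 5" lines arrive; the update adds the value already
# stored for the key instead of the value that was just read.
for _ in range(3):
    totals["CANDY"] += int(totals.get("CANDY", 0))
    history.append(totals["CANDY"])

print(history)  # [5, 10, 20, 40]
```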


E.


Re: Simple exercise

2016-03-10 Thread Peter Otten
Rodrick Brown wrote:

> From the following input
> 
> 9
> BANANA FRIES 12
> POTATO CHIPS 30
> APPLE JUICE 10
> CANDY 5
> APPLE JUICE 10
> CANDY 5
> CANDY 5
> CANDY 5
> POTATO CHIPS 30
> 
> I'm expecting the following output
> BANANA FRIES 12
> POTATO CHIPS 60
> APPLE JUICE 20
> CANDY 20
> 
> However my code seems be returning incorrect value
> 
> #!/usr/bin/env python3
> 
> import sys
> import re
> from collections import OrderedDict
> 
> if __name__ == '__main__':
> 
>   od = OrderedDict()
>   recs = int(input())
> 
>   for _ in range(recs):
>     file_input = sys.stdin.readline().strip()
>     m = re.search(r"(\w.+)\s+(\d+)", file_input)
> 
>     if m:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))

Look closely at the line above. 

What value do you want to add to the current sum?
What value are you actually providing on the right side?

>   for k,v in od.items():
>     print(k,v)
> 
> What's really going on here?
> 
> $ cat groceries.txt | ./groceries.py
> BANANA FRIES 12
> POTATO CHIPS 60
> APPLE JUICE 20
> CANDY 40




Re: Simple exercise

2016-03-10 Thread Thomas 'PointedEars' Lahn
Rodrick Brown wrote:

> […]
>     if m:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))
> […]

This program logic appears to be wrong: you are not adding the value that 
you just read to the dictionary entry for the key that you just read, but 
the value that was already in the dictionary for that key.  Perhaps you were 
looking for this (I also optimized a bit):

key = m.group(1)
value = int(m.group(1))

if key not in od:
  od[key] = value
else:
  od[key] += value

But there is probably an even more pythonic way to do this.
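
One possible variant, offered as a sketch rather than the definitive
answer: dict.get() with a default folds the two branches into a single
line. The sample input and the regular expression are taken from the
original post; the names "lines" and "totals" are made up for this
example.

```python
import re
from collections import OrderedDict

# Sample records from the original post (without the leading count line).
lines = """BANANA FRIES 12
POTATO CHIPS 30
APPLE JUICE 10
CANDY 5
APPLE JUICE 10
CANDY 5
CANDY 5
CANDY 5
POTATO CHIPS 30""".splitlines()

totals = OrderedDict()
for line in lines:
    m = re.search(r"(\w.+)\s+(\d+)", line)
    if m:
        key = m.group(1)
        value = int(m.group(2))
        # get() returns 0 for an unseen key, so no if/else is needed.
        totals[key] = totals.get(key, 0) + value

for key, value in totals.items():
    print(key, value)
```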

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


Re: Text input with keyboard, via input methods

2016-03-10 Thread Marko Rauhamaa
Ben Finney :

> As for how solved it is, that depends on what you're hoping for as a
> solution.
>
> [...]
>
> Hopefully your operating system has a good input method system, with
> many input methods available to choose from. May you find a decent
> default there.

I don't have an answer. I have requirements, though:

 * I should be able to get the character by knowing its glyph (shape).

 * It should be very low-level and work system-wide, preferably over the
   network (I'm typing this over the network).

The solution may require a touch screen and a canvas where I can draw
with my fingers.

The solution may have to be implemented in the keyboard.

Or maybe we'll have to wait for brain-implantable bluetooth transceivers.
Then, we'd just think of the character and it would appear on the
screen.


Marko


Re: Simple exercise

2016-03-10 Thread Thomas 'PointedEars' Lahn
Thomas 'PointedEars' Lahn wrote:

> key = m.group(1)
> value = int(m.group(1))

value = int(m.group(2))

 
> if key not in od:
>   od[key] = value
> else:
>   od[key] += value

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


looping and searching in numpy array

2016-03-10 Thread Heli
Dear all, 

I need to loop over a numpy array and then do the following search. The 
following takes almost 60 s for arrays (npArray1 and npArray2 in the 
example below) with around 300K values. 


for id in np.nditer(npArray1):
    newId = (np.where(npArray2 == id))[0][0]


Is there any way I can make the above faster? I need to run the script above on 
much bigger arrays (50M). Please note that my two numpy arrays in the lines 
above, npArray1 and npArray2, are not necessarily the same size, but they are 
both 1d. 
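
One possible direction, sketched here with plain Python lists so that it
runs without numpy: build a dictionary from each value to the index of
its first occurrence once, then look every value up in it. That replaces
a full scan of the second array per element with a single pass over each
array. The function name and the sample data are invented for the
example, and it assumes that every searched value does occur in the
second array (the original [0][0] indexing would fail otherwise, too).

```python
def first_index_lookup(needles, haystack):
    """For each value in needles, return the index of its first occurrence in haystack."""
    first_index = {}
    for position, value in enumerate(haystack):
        # setdefault keeps the earliest position, mirroring np.where(...)[0][0].
        first_index.setdefault(value, position)
    # Raises KeyError if a needle is missing from haystack.
    return [first_index[value] for value in needles]


haystack = [40, 10, 30, 10, 20]   # plays the role of npArray2
needles = [10, 20, 40]            # plays the role of npArray1

new_ids = first_index_lookup(needles, haystack)
print(new_ids)  # [1, 4, 0]
```

With real numpy arrays, converting them with .tolist() before calling
such a function should keep the dictionary keys as plain Python numbers.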


Thanks a lot for your help, 



Re: Question

2016-03-10 Thread Jon Ribbens
On 2016-03-10, Dennis Lee Bieber  wrote:
> On Wed, 9 Mar 2016 12:28:30 - (UTC), Jon Ribbens
> declaimed the following:
>>On 2016-03-09, Ian Kelly  wrote:
>>> It looks like the shell environment that comes with Git for Windows is
>>> actually Windows Powershell [1], so presumably the activate.ps1 script
>>> that's already provided by venv is what's needed, not a bash script.
>>
>>This is not true. I installed Git for Windows and what it gave me was
>>"Git Bash" which as you say runs in a window titled "MINGW64".
>>
>>If I try to run the activate.ps1 script it says:
>>
>>$ env/Scripts/activate.ps1
>>env/Scripts/activate.ps1: line 1: syntax error near unexpected token 
>>`[switch]$NonDestructive'
>>env/Scripts/activate.ps1: line 1: `function global:deactivate 
>>([switch]$NonDestructive) {'
>
>   Which is no surprise, as .ps1 is a PowerShell script

Indeed, I was just pointing out that the shell Git for Windows
installed was definitely not Powershell (or at best, Powershell
is only one of the available options).

> http://stackoverflow.com/questions/30577271/activating-pyvenv-from-gitbash-for-windows

Yes, I tried the solution suggested there (copy the 'activate' script
from an existing Unix virtualenv) and unfortunately it didn't work
(it looked like it worked but 'pip install' still installed things in
the system packages directory not the virtualenv).


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread BartC

On 10/03/2016 07:30, Mark Lawrence wrote:

On 10/03/2016 00:58, BartC wrote:



You think a bloody great compiler is a microbenchmark?!


I have no interest in the speed of the compiler, I am interested in the
run time speed of the applications that it produces which is what has
been discussed thus far.


Perhaps you missed the fact that this compiler is written in the very 
language it compiles. The code generated /is/ the application. And 
compilers are real applications that people use all the time.



A compiler is another good 'pure language' task because, apart from
input and output at each end, all the computation is self-contained.)


I've no idea what this is meant to mean.


It means the task doesn't do any function calls to external libraries.


Meaning that in this dreadful place called the real world it's less than
useless in many cases.


Suppose you were on the development team that writes the optimising 
stages of a C compiler. You need to test the performance of the code it 
produces so that you can compare one optimisation with another. Would you:


(a) Test only the code that is generated by your compiler

(b) Include also the runtime of third-party libraries consisting of 
unknown code, written in an unknown language, with an unknown compiler 
and with unknown optimisation settings?


You seem to be suggesting that (b) is a valid way of measuring the 
performance of a language.



Yes I am, as you appear to know squat.


I don't think I've ever traded insults on usenet or any public forum. 
I'm too nice a chap. But today it's rather tempting!



But, yeah, I was writing international applications decades ago. I'm not
working for anyone now and don't need to bother.


So your new language doesn't bother with unicode then?


Yes, it has provision for it. But I've not got round to implementing it. 
Other things have more priority. Or are more interesting. As I said, 
I've had all that fun before.


If I desperately needed to use Unicode today, then something can be 
arranged either with the language or around it. It's not a big deal.


--
Bartc


Re: Text input with keyboard, via input methods

2016-03-10 Thread Rustom Mody
On Thursday, March 10, 2016 at 4:21:15 PM UTC+5:30, Marko Rauhamaa wrote:
> Ben Finney :
> 
> > As for how solved it is, that depends on what you're hoping for as a
> > solution.
> >
> > [...]
> >
> > Hopefully your operating system has a good input method system, with
> > many input methods available to choose from. May you find a decent
> > default there.
> 
> I don't have an answer. I have requirements, though:
> 
>  * I should be able to get the character by knowing its glyph (shape).
> 
>  * It should be very low-level and work system-wide, preferably over the
>    network (I'm typing this over the network).
> 
> The solution may require a touch screen and a canvas where I can draw
> with my fingers.
> 
> The solution may have to be implemented in the keyboard.
> 
> Or maybe we'll have to wait for brain-implantable bluetooth tranceivers.
> Then, we'd just think of the character and it would appear on the
> screen.

Let's say you wrote/participated in a million-line C/C++ codebase.
And let's say you would/could redo it in super-duper language 'L' (could be,
but need not be, Python).
Even if you believe in a 1000-fold improvement going C → L, you'd still need
to input 1000 lines of L-code.

How would you do it? With character recognition?

OTOH…

I am ready to bet that on your keyboard maybe US-104, maybe something more
exotic:

- There is a key that is marked something that looks like 'A'
- Pounding that 'A' produces something that looks like 'a'
- And to get an 'A' from the 'A' you need to do a SHIFT-A

IOW it's easy to forget that typing ASCII on a US-104 still needs input methods.

It's just that these need to become more reified/first-class going from the
<100 chars of ASCII to the million+ of Unicode.

Or if I may invoke programmer-intuition:

a. If one had to store a dozen values, a dozen variables would be ok
b. For a thousand, we'd like a -- maybe simple -- datastructure like an 
array/dict
c. For a million (or billion) the data structure would need to be sophisticated

The problem with unicode is not that 10^6 is a large number but that
we are applying a-paradigm to c-needs.

Some of my --admittedly faltering -- attempts to correct this:

http://blog.languager.org/2015/01/unicode-and-universe.html
http://blog.languager.org/2015/03/whimsical-unicode.html
http://blog.languager.org/2015/02/universal-unicode.html


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 11:50, BartC wrote:

On 10/03/2016 07:30, Mark Lawrence wrote:

On 10/03/2016 00:58, BartC wrote:



You think a bloody great compiler is a microbenchmark?!


I have no interest in the speed of the compiler, I am interested in the
run time speed of the applications that it produces which is what has
been discussed thus far.


Perhaps you missed the fact that this compiler is written in the very
language it compiles. The code generated /is/ the application. And
compilers are real applications that people use all the time.


A compiler is another good 'pure language' task because, apart from
input and output at each end, all the computation is self-contained.)


I've no idea what this is meant to mean.


It means the task doesn't do any function calls to external libraries.


Meaning that in this dreadful place called the real world it's less than
useless in many cases.


Suppose you were on the development team that writes the optimising
stages of a C compiler. You need to test the performance of the code it
produces so that you can compare one optimisation with another. Would you:

(a) Test only the code that is generated by your compiler

(b) Include also the runtime of third-party libraries consisting of
unknown code, written in an unknown language, with an unknown compiler
and with unknown optimisation settings?


What has an optimising C compiler got to do with the run time speed of 
Python, which in many cases is perfectly adequate? I'll repeat for 
possibly the fourth time, the vast majority of people have no interest 
in run time speed as they are fully aware that they'll be wasting their 
precious development time, as they know that their code will be waiting 
on the file, the database or the network.  What have you failed to grasp 
about that?




You seem to be suggesting that (b) is a valid way of measuring the
performance of a language.


Yes I am, as you appear to know squat.


I don't think I've ever traded insults on usenet or any public forum.
I'm too nice a chap. But today it's rather tempting!


But, yeah, I was writing international applications decades ago. I'm not
working for anyone now and don't need to bother.


So your new language doesn't bother with unicode then?


Yes, it has provision for it. But I've not got round to implementing it.
Other things have more priority. Or are more interesting. As I said,
I've had all that fun before.

If I desperately needed to use Unicode today, then something can be
arranged either with the language or around it. It's not a big deal.



Laughable.  It's 2016, but "then something can be arranged either with 
the language or around it".  It's not what you personally want, it's 
what the entire world wants.  If you think that your language is going 
to take over the world without unicode support, just because it's faster 
than Python, I seriously suggest that you see a trick cyclist, and 
rather quickly.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread BartC

On 10/03/2016 01:59, Steven D'Aprano wrote:


I think Bart is very old-school, and probably a bit behind the times when it
comes to modern compiler and interpreter technologies.


That's true. I've reached a dead-end with what I can do with
interpreted, dynamically typed byte-code, but it still holds its own
compared with other non-accelerated scripting languages, even PyPy
sometimes.


(Although other JIT projects I think are faster than PyPy, eg. LuaJIT. 
Very compact too.)


(I could also go the JIT route, but it's very complicated and not much 
fun! And once you start generating custom native code, then you're 
competing with proper compilers.)


 But that doesn't

matter: the old timers knew a thing or two, and in some ways the old days
were better:

http://prog21.dadgum.com/116.html


I fear that Bart still holds quite a few misapprehensions about Python. But
he seems happy to discuss the language


I have an interest in C and in Python because those are probably the two 
languages I'd be using now, if I hadn't been spoilt by having to create 
my own versions in the 1980s.


I've watched Python's development with interest because there were some 
parallels with the script language I was using for my applications (I 
decided my language needed byte-arrays bolted on; Python also added 
byte-arrays!)


Python however decided to be far more dynamic. (Making efficient 
interpreters a bit harder to write.)


--
Bartc


How to program round this poplib error?

2016-03-10 Thread cl
I have a (fairly simple) Python program that scans through a
'catchall' E-Mail address for things that *might* be for me.  It sends
anything that could be for me to my main E-Mail and discards the rest.

However I *occasionally* get an error from it as follows:-

Traceback (most recent call last):
  File "/home/chris/.mutt/bin/getCatchall.py", line 65, in 
    pop3.dele(i+1)
  File "/usr/lib/python2.7/poplib.py", line 240, in dele
    return self._shortcmd('DELE %s' % which)
  File "/usr/lib/python2.7/poplib.py", line 160, in _shortcmd
    return self._getresp()
  File "/usr/lib/python2.7/poplib.py", line 132, in _getresp
    resp, o = self._getline()
  File "/usr/lib/python2.7/poplib.py", line 377, in _getline
    raise error_proto('line too long')
poplib.error_proto: line too long


Does anyone have any idea how I can program around this somehow?  As
it is at the moment I have to go to the webmail system at my ISP and
manually delete the message which is a bit of a nuisance.

The piece of code that produces the error is as follows:-

    #
    #
    # Read each message into a string and then parse with the email module, if
    # there's an error retrieving the message then just throw it away
    #
    try:
        popmsg = pop3.retr(i+1)
    except:
        pop3.dele(i+1)
        continue

The trouble is that the error is (presumably) some sort of buffer size
limitation in pop3.dele().  If I trap the error then I still can't get
rid of the rogue E-Mail and, more to the point, I can't even identify
it so that the trap could report the error and tell me which message
was causing it.

I guess one way to get around the problem would be to increase
_MAXLINE in /usr/lib/python2.7/poplib.py, it's on my own system so I
could do this.  Can anyone think of a better approach?



-- 
Chris Green


Re: How to program round this poplib error?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 12:04, c...@isbd.net wrote:

I have a (fairly simple) Python program that scans through a
'catchall' E-Mail address for things that *might* be for me.  It sends
anything that could be for me to my main E-Mail and discards the rest.

However I *occasionally* get an error from it as follows:-

 Traceback (most recent call last):
   File "/home/chris/.mutt/bin/getCatchall.py", line 65, in 
     pop3.dele(i+1)
   File "/usr/lib/python2.7/poplib.py", line 240, in dele
     return self._shortcmd('DELE %s' % which)
   File "/usr/lib/python2.7/poplib.py", line 160, in _shortcmd
     return self._getresp()
   File "/usr/lib/python2.7/poplib.py", line 132, in _getresp
     resp, o = self._getline()
   File "/usr/lib/python2.7/poplib.py", line 377, in _getline
     raise error_proto('line too long')
 poplib.error_proto: line too long


Does anyone have any idea how I can program around this somehow?  As
it is at the moment I have to go to the webmail system at my ISP and
manually delete the message which is a bit of a nuisance.



How about a try/except in your code that catches poplib.error_proto?

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread BartC

On 10/03/2016 12:15, Mark Lawrence wrote:

On 10/03/2016 11:50, BartC wrote:



Suppose you were on the development team that writes the optimising
stages of a C compiler. You need to test the performance of the code it
produces so that you can compare one optimisation with another. Would
you:

(a) Test only the code that is generated by your compiler

(b) Include also the runtime of third-party libraries consisting of
unknown code, written in an unknown language, with an unknown compiler
and with unknown optimisation settings?


What has an optimising C compiler got to do with the run time speed of
Python, which in many cases is perfectly adequate?



I'll repeat for
possibly the fourth time, the vast majority of people


The vast majority aren't implementing the language!


have no interest
in run time speed as they are fully aware that they'll be wasting their
precious development time, as they know that their code will be waiting
on the file, the database or the network.  What have you failed to grasp
about that?


Tell that to the people who have been working on optimising compilers 
for the last couple of decades. Why bother making that inner loop 10% 
faster, when the program will most likely be blocked waiting for input 
anyway?


You just don't get it.

(BTW next time you have a look at the CPython source code, count how
many times the words 'fast', 'faster' and 'fastest' occur. It obviously 
was a preoccupation with the implementers. If Python is currently fast 
enough for you, then thank those people who didn't just shrug their 
shoulders and not bother!)



Re: Installing ibm_db package on Windows 7, 64-bit problem

2016-03-10 Thread Chris Angelico
On Thu, Mar 10, 2016 at 8:31 PM, Alexander Shmugliakov
 wrote:
> Thank you Chris! Actually I have received your response while in the process 
> (quite a lengthy one for some reason) of the VS Community Edition 
> installation. Will see if it will solve my problem (or at least brings me the 
> missing dll). I appreciate your immediate response anyway.
>

Great! Setting up a dev environment on Windows isn't easy (it's not
like on Linux where building things from source is the normal and
expected thing), but hopefully it'll reward the effort. And it should
be a one-off setup job; this same version of MS VS should work for
building extensions against future versions of Python, too. (This
wasn't previously the case. That's what the post I linked to on Steve
Dower's blog is all about.)

ChrisA


Re: looping and searching in numpy array

2016-03-10 Thread Peter Otten
Heli wrote:

> Dear all,
> 
> I need to loop over a numpy array and then do the following search. The
> following is taking almost 60(s) for an array (npArray1 and npArray2 in
> the example below) with around 300K values.
> 
> 
> for id in np.nditer(npArray1):
>     newId=(np.where(npArray2==id))[0][0]
> 
> 
> Is there anyway I can make the above faster? I need to run the script
> above on much bigger arrays (50M). Please note that my two numpy arrays in
> the lines above, npArray1 and npArray2  are not necessarily the same size,
> but they are both 1d.

You mean you are looking for the index of the first occurrence in npArray2
for every value of npArray1?

I don't know how to do this in numpy (I'm not an expert), but even basic 
Python might be acceptable:

lookup = {}
for i, v in enumerate(npArray2):
    if v not in lookup:
        lookup[v] = i

for v in npArray1:
    print(lookup.get(v, ""))

That way you iterate once (in Python) instead of 2*len(npArray1) times (in 
C) over npArray2.
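
A vectorised variant of the same first-occurrence lookup can be sketched with numpy alone. This is only a sketch under assumptions: the function name first_indices and the missing=-1 sentinel are made up for illustration, and the searched array is assumed to be 1d and non-empty. It relies on np.unique(..., return_index=True), which returns the sorted unique values together with the index of each value's first occurrence, and on np.searchsorted to locate each sought value in that sorted array.

```python
import numpy as np


def first_indices(haystack, needles, missing=-1):
    """For each value in needles, return the index of its first
    occurrence in haystack, or `missing` if it does not occur.

    Hypothetical helper; assumes haystack is 1d and non-empty.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    # Sorted unique values plus the index of each one's first occurrence.
    values, first = np.unique(haystack, return_index=True)
    # Position where each needle would sit in the sorted unique values.
    pos = np.searchsorted(values, needles)
    # Needles larger than every value land one past the end; clamp them.
    pos = np.clip(pos, 0, len(values) - 1)
    # A needle is present only if the value at that position equals it.
    found = values[pos] == needles
    return np.where(found, first[pos], missing)
```

With the names from the question this would be called as first_indices(npArray2, npArray1). The sort inside np.unique costs O(n log n), but the loops run inside numpy rather than in Python.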



Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Chris Angelico
On Thu, Mar 10, 2016 at 11:47 PM, BartC  wrote:
>> have no interest
>> in run time speed as they are fully aware that they'll be wasting their
>> precious development time, as they know that their code will be waiting
>> on the file, the database or the network.  What have you failed to grasp
>> about that?
>
>
> Tell that to the people who have been working on optimising compilers for
> the last couple of decades. Why bother making that inner loop 10% faster,
> when the program will most likely be blocked waiting for input anyway?
>
> You just don't get it.
>

Both of you "just don't get" something that the other sees as
critically important. Before this thread gets to fisticuffs, may I
please summarize the points that probably nobody will concede?

1) Unicode support, intrinsic to the language, is crucial, even if
BartC refuses to recognize this. Anything released beyond the confines
of his personal workspace will need full Unicode support, otherwise it
is a problem to the rest of the world, and should be destroyed with
fire. Thanks.

2) Interpreter performance, and the performance of code emitted by a
compiler (distinct from "compiler performance" which would be how
quickly it can compile code) makes a huge difference to real-world
applications, even if MarkL refuses to recognize this. While it
doesn't hurt the rest of the world to have a slow implementation of a
language, it does _benefit_ the world to have a _faster_
implementation, as long as that doesn't come at unnecessary costs.

3) There is a point at which performance ceases to matter for a given
application. This point varies from app to app, but generally is
reached when I/O wait time dwarfs CPU usage. A language which has
reached this point for the majority of applications can be said to be
"fast enough", not because its developers do not care about
performance (particularly regressions), and not because further
improvements are useless, but because there are other considerations
more important than pure run-time speed.

4) Burying the bulk of something away in external API calls is a great
way to make the real performance improve, but makes performance
*measurement* harder. This matters to benchmarking and to real-world
usage in different (almost, but not entirely, contradictory) ways.

When people want better performance out of a number-crunching Python
program, they have a few options. One is to rewrite their code in C or
Fortran or something. Another is to make small tweaks so the bulk of
the work is handled by numpy or Cython. A third is to keep their code
completely unchanged, but run it under PyPy instead of whatever they
were previously using (probably CPython). Rewriting in
C/Fortran is generally a bad idea; you pay the price over the whole
application, when optimizing a small subset of it would give 99% of
the performance improvement. That's why actual CPython byte-code
interpretation performance isn't so critical; if we can change 5% of
the code so it uses numpy, we keep 95% of it in idiomatic Python,
while still having the bulk of the work done in Fortran. CPython has
other priorities than performance - not to say that "slow is fine",
but more that "slow and dynamic opens up possibilities that fast and
static precludes, so we're happy to pay the price for the features we
want".
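
The effect of pushing the hot loop out of interpreted byte-code can be seen even without numpy. The sketch below is an illustration only: py_sum is a made-up helper, the data size and repeat count are arbitrary, and the timings depend on the machine and the interpreter (under PyPy the gap may largely disappear). It times a hand-written loop against the builtin sum, which in CPython is implemented in C.

```python
import timeit


def py_sum(values):
    """Add up values with an explicit Python-level loop."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100000))

# Same work, same result; only where the loop runs differs.
loop_time = timeit.timeit(lambda: py_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print("python loop: %.4fs, builtin sum: %.4fs" % (loop_time, builtin_time))
```

Only the one function changed; the rest of the program stays ordinary Python, which is the trade described above.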

ChrisA


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 12:47, BartC wrote:

On 10/03/2016 12:15, Mark Lawrence wrote:

On 10/03/2016 11:50, BartC wrote:



Suppose you were on the development team that writes the optimising
stages of a C compiler. You need to test the performance of the code it
produces so that you can compare one optimisation with another. Would
you:

(a) Test only the code that is generated by your compiler

(b) Include also the runtime of third-party libraries consisting of
unknown code, written in an unknown language, with an unknown compiler
and with unknown optimisation settings?


What has an optimising C compiler got to do with the run time speed of
Python, which in many cases is perfectly adequate?



I'll repeat for
possibly the fourth time, the vast majority of people


The vast majority aren't implementing the language!


No, they are complete weirdos called USERS.  Have you ever met any?




have no interest
in run time speed as they are fully aware that they'll be wasting their
precious development time, as they know that their code will be waiting
on the file, the database or the network.  What have you failed to grasp
about that?


Tell that to the people who have been working on optimising compilers
for the last couple of decades. Why bother making that inner loop 10%
faster, when the program will most likely be blocked waiting for input
anyway?

You just don't get it.


At last you manage to get something correct, I do not get your obsession 
with run time speed.  For the fifth time, the vast majority of users 
simply do not care.  If it is fast enough, job done.


I have dealt with some complete dumbos in the years that I've been 
online, but when it comes to thickos you're right up there with the RUE, 
and I can assure you that this is meant to be an insult.


To your way of thinking run time speed is the sole issue with 
programming, and trivial details like accuracy are irrelevant. "Look, 
Python has taken a whole minute to process this data, but BartC has done 
it in one nanosecond". "Yes, but Python is 100% accurate, BartC is 100% 
inaccurate". "Who cares, only speed counts".


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: looping and searching in numpy array

2016-03-10 Thread Mark Lawrence

On 10/03/2016 11:43, Heli wrote:

Dear all,

I need to loop over a numpy array and then do the following search. The 
following is taking almost 60(s) for an array (npArray1 and npArray2 in the 
example below) with around 300K values.


for id in np.nditer(npArray1):
    newId=(np.where(npArray2==id))[0][0]


Is there anyway I can make the above faster? I need to run the script above on 
much bigger arrays (50M). Please note that my two numpy arrays in the lines 
above, npArray1 and npArray2  are not necessarily the same size, but they are 
both 1d.


Thanks a lot for your help,



I'm no numpy expert, but if you're using a loop my guess is that you're 
doing it wrong.  I suggest your first port of call is the numpy docs if
you haven't already been there, then the specific numpy mailing list
or stackoverflow, as it seems very likely that this type of question has 
been asked before.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: How to program round this poplib error?

2016-03-10 Thread Jon Ribbens
On 2016-03-10, c...@isbd.net  wrote:
> # Read each message into a string and then parse with the email module, if
> # there's an error retrieving the message then just throw it away
> #
> try:
>     popmsg = pop3.retr(i+1)
> except:
>     pop3.dele(i+1)
>     continue
>
> The trouble is that the error is (presumably) some sort of buffer size
> limitation in pop3.dele().  If I trap the error then I still can't get
> rid of the rogue E-Mail and, more to the point, I can't even identify
> it so that the trap could report the error and tell me which message
> was causing it.

You really, really should not be using bare "except:".
Always specify which exceptions you are trying to catch.

In this case, I think there are two problems. Firstly, I think
whoever implemented poplib mis-read the POP3 specification, as
they are applying the line-length limit to not just the POP3
commands and responses, but the email contents too.

Secondly, you are just trying to carry on with the POP3 connection
after it has thrown an exception. You can't do that, because you
don't know what the problem was. My guess would be that what you
are mostly seeing is a line in the email content that is over 2kB,
which causes 'retr' to throw a "line too long" exception.

You then blindly throw a "DELE" at the server, and when you try to
read the response to that command it throws another "line too long"
exception because (a) the server's actually still in the middle of
sending the email contents and (b) there's a bug in the SSL poplib
which means once it's thrown "line too long" it will keep doing so
repeatedly.

So what I think you need to do is:

  (a) after your "import poplib" add "poplib._MAXLINE = 10*1024*1024"
  or somesuch (i.e. increase it a lot),

  (b) get rid of your "except:" and work out what you really meant,
  checking what the error returned was before blindly throwing
  commands at a POP3 server in an unknown state. You may well
  need to disconnect and reconnect before continuing - or indeed
  you may well not need to catch any exception at all at this
  point after doing (a).
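
Steps (a) and (b) might be combined roughly as follows. This is a sketch, not a drop-in replacement for the original script: fetch_all and the connect callable are hypothetical names, and connect is assumed to return a freshly logged-in poplib.POP3 or POP3_SSL object. Note also that _MAXLINE is an undocumented internal of poplib, so overriding it is itself an assumption about the installed version.

```python
import poplib

# Assumption: overriding this private, undocumented limit is acceptable.
poplib._MAXLINE = 10 * 1024 * 1024


def fetch_all(connect):
    """Retrieve every message; return (messages, failed).

    `connect` is a hypothetical zero-argument callable that returns a
    logged-in POP3 connection. `failed` lists (number, error text) so
    the offending message can at least be identified.
    """
    pop3 = connect()
    count = pop3.stat()[0]
    messages, failed = {}, []
    for num in range(1, count + 1):
        try:
            messages[num] = pop3.retr(num)[1]
        except poplib.error_proto as exc:
            # The session is in an unknown state now: record which
            # message failed and reconnect instead of sending DELE.
            failed.append((num, str(exc)))
            try:
                pop3.close()
            except Exception:
                pass
            pop3 = connect()
    pop3.quit()
    return messages, failed
```

Because the broken session is dropped without QUIT, nothing is deleted on the server and the message numbers stay the same after reconnecting.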


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Chris Angelico
On Fri, Mar 11, 2016 at 12:18 AM, Mark Lawrence  wrote:
> I have dealt with some complete dumbos in the years that I've been online,
> but when it comes to thickos you're right up there with the RUE, and I can
> assure you that this is meant to be an insult.
>
> To your way of thinking run time speed is the sole issue with programming,
> and trivial details like accuracy are irrelevant. "Look, Python has taken a
> whole minute to process this data, but BartC has done it in one nanosecond".
> "Yes, but Python is 100% accurate, BartC is 100% inaccurate". "Who cares,
> only speed counts".

Mark, you don't need to be this vitriolic. It's possible to disagree
with Bart without being a complete  yourself. Please, have a
read of my previous post, and give yourself a moment to cool down. At
no time has Bart ever said that accuracy is unimportant; at worst,
he's sacrificing dynamism, maybe maintainability, but not accuracy.

A little calmness, please.

ChrisA


Encapsulation in Python

2016-03-10 Thread Ben Mezger
Hi all,

I've been studying Object Oriented Theory using Java. Theoretically, all
attributes should be private, meaning nothing except the class's own
methods can access the attribute;

public class Foo {
    private int bar;
    ...

Normally in Java, we would write getters and setters to set/get the
attribute bar. However, in Python, we normally create a class like so;

class Foo(object):
    bar = 0
    ...

And we usually don't write any getters/setters (though they exist in
Python, I have not seen many projects making use of them).

We can easily encapsulate (hide data in) the Foo class by prefixing a
new attribute's name with '_' (underscore); however, this would require
all attributes to have an underscore.
According to this answer [1], it's acceptable to expose your
attribute directly (Foo.bar = 0), so I wonder where the encapsulation
happens in Python? If I can access the attribute whenever I want (with
the exception of those using an underscore), what's the best way to
encapsulate a class in Python? Why do most projects access the variable
directly instead of using getters/setters?

Regards,

Ben Mezger

[1] - http://stackoverflow.com/q/4555932





Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 13:08, Chris Angelico wrote:


When people want better performance out of a number-crunching Python
program, they have a few options. One is to rewrite their code in C or
Fortran or something. Another is to make small tweaks so the bulk of
the work is handled by numpy or Cython. A third is to keep their code
completely unchanged, but run it under PyPy instead of whatever they
were previously using (probably CPython). Rewriting in
C/Fortran is generally a bad idea; you pay the price over the whole
application, when optimizing a small subset of it would give 99% of
the performance improvement. That's why actual CPython byte-code
interpretation performance isn't so critical; if we can change 5% of
the code so it uses numpy, we keep 95% of it in idiomatic Python,
while still having the bulk of the work done in Fortran. CPython has
other priorities than performance - not to say that "slow is fine",
but more that "slow and dynamic opens up possibilities that fast and
static precludes, so we're happy to pay the price for the features we
want".

ChrisA



This should be the first option 
https://wiki.python.org/moin/PythonSpeed/PerformanceTips


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Encapsulation in Python

2016-03-10 Thread Mark Lawrence

On 10/03/2016 13:41, Ben Mezger wrote:

Hi all,

I've been studying Object Oriented Theory using Java. Theoretically, all
attributes should be private, meaning nothing except the class's own
methods can access the attribute;


I suggest that you read 
http://dirtsimple.org/2004/12/python-is-not-java.html and 
http://dirtsimple.org/2004/12/java-is-not-python-either.html




public class Foo {
 private int bar;
 ...

Normally in Java, we would write getters and setters to set/get the
attribute bar. However, in Python, we normally create a class like so;

class Foo(object):
 bar = 0
 ...

And we usually don't write any getters/setters (though they exist in
Python, I have not seen many projects making use of them).


Python programmers in the main see getters/setters as unneeded, time 
wasting boilerplate.




We can easily encapsulate (hide data in) the Foo class by prefixing a
new attribute's name with '_' (underscore); however, this would require
all attributes to have an underscore.


No, this is merely a convention that can be worked around if you really 
want to.  The same applies to the use of the double underscore.



According to this answer [1], it's acceptable to expose your
attribute directly (Foo.bar = 0), so I wonder where the encapsulation
happens in Python? If I can access the attribute whenever I want (with
the exception of those using an underscore), what's the best way to
encapsulate a class in Python? Why do most projects access the variable
directly instead of using getters/setters?


You have misunderstood.  The '_' is just a convention that says, "this 
is private, please keep your mitts off".  There is nothing to stop a 
programmer from using it.
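
A small sketch of what that looks like in practice, assuming a made-up class Foo with a single attribute bar and an invented rule that bar must not be negative: callers simply write foo.bar, and if validation is ever needed the attribute can be turned into a property without changing any calling code. The leading underscore on _bar is only the convention described above; nothing enforces it.

```python
class Foo(object):
    """Illustrative class: `bar` looks like a plain attribute to callers."""

    def __init__(self, bar=0):
        self.bar = bar  # goes through the property setter below

    @property
    def bar(self):
        return self._bar

    @bar.setter
    def bar(self, value):
        # Validation added later, without changing how callers use foo.bar.
        if value < 0:
            raise ValueError("bar must be non-negative")
        self._bar = value


foo = Foo()
foo.bar = 5        # looks like direct attribute access
print(foo.bar)

foo._bar = -1      # "private" by convention only; Python does not stop this
print(foo.bar)
```

This is the usual reason Python code skips getters/setters up front: a plain attribute can be upgraded to a property later, so the boilerplate buys nothing in advance.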




Regards,

Ben Mezger

[1] - http://stackoverflow.com/q/4555932



I suggest that you reread the stackoverflow link that you've quoted, and 
take great notice of the response from Lennart Regebro, even if it has 
been downvoted.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: How to program round this poplib error?

2016-03-10 Thread cl
Jon Ribbens  wrote:
> On 2016-03-10, c...@isbd.net  wrote:
> > # Read each message into a string and then parse with the email module, if
> > # there's an error retrieving the message then just throw it away
> > #
> > try:
> >     popmsg = pop3.retr(i+1)
> > except:
> >     pop3.dele(i+1)
> >     continue
> >
> > The trouble is that the error is (presumably) some sort of buffer size
> > limitation in pop3.dele().  If I trap the error then I still can't get
> > rid of the rogue E-Mail and, more to the point, I can't even identify
> > it so that the trap could report the error and tell me which message
> > was causing it.
> 
> You really, really should not be using bare "except:".
> Always specify which exceptions you are trying to catch.
> 
Yes, I know, but it doesn't really relate to the problem, does it?


> In this case, I think there are two problems. Firstly, I think
> whoever implemented poplib mis-read the POP3 specification, as
> they are applying the line-length limit to not just the POP3
> commands and responses, but the email contents too.
> 
> Secondly, you are just trying to carry on with the POP3 connection
> after it has thrown an exception. You can't do that, because you
> don't know what the problem was. My guess would be that what you
> are mostly seeing is a line in the email content that is over 2kB,
> which causes 'retr' to throw a "line too long" exception.
> 
> You then blindly throw a "DELE" at the server, and when you try to
> read the response to that command it throws another "line too long"
> exception because (a) the server's actually still in the middle of
> sending the email contents and (b) there's a bug in the SSL poplib
> which means once it's thrown "line too long" it will keep doing so
> repeatedly.
> 
> So what I think you need to do is:
> 
>   (a) after your "import poplib" add "poplib._MAXLINE = 10*1024*1024"
>   or somesuch (i.e. increase it a lot),
> 
Ah, that's a much better way of doing it than actually changing the
code, thank you.


>   (b) get rid of your "except:" and work out what you really meant,
>   checking what the error returned was before blindly throwing
>   commands at a POP3 server in an unknown state. You may well
>   need to disconnect and reconnect before continuing - or indeed
>   you may well not need to catch any exception at all at this
>   point after doing (a).

Yes, hopefully the exception will go away.
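
A minimal sketch of steps (a) and (b) above, not a tested drop-in for
getCatchall.py. poplib._MAXLINE is a private module attribute, so raising
it is a workaround that may change between Python versions. fetch_messages
is a hypothetical helper name, and pop3 is assumed to be an already
authenticated poplib.POP3 or POP3_SSL object.

```python
import poplib

# Workaround (private attribute): allow very long lines in message bodies.
poplib._MAXLINE = 10 * 1024 * 1024


def fetch_messages(pop3):
    """Return (messages, failed) for every message on the server.

    messages is a list of (number, raw_bytes); failed is a list of
    (number, error_text). The loop stops at the first protocol error
    because the connection state is unknown after that point.
    """
    count, _size = pop3.stat()
    messages, failed = [], []
    for number in range(1, count + 1):
        try:
            _resp, lines, _octets = pop3.retr(number)
        except poplib.error_proto as exc:
            # Report which message failed instead of blindly sending DELE.
            failed.append((number, str(exc)))
            break
        messages.append((number, b"\n".join(lines)))
    return messages, failed
```

If failed is non-empty, the caller can log the message number, then
disconnect and reconnect before sending any further commands.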

Thank you again.

-- 
Chris Green
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to program round this poplib error?

2016-03-10 Thread cl
Mark Lawrence  wrote:
> On 10/03/2016 12:04, c...@isbd.net wrote:
> > I have a (fairly simple) Python program that scans through a
> > 'catchall' E-Mail address for things that *might* be for me.  It sends
> > anything that could be for me to my main E-Mail and discards the rest.
> >
> > However I *occasionally* get an error from it as follows:-
> >
> >  Traceback (most recent call last):
> >File "/home/chris/.mutt/bin/getCatchall.py", line 65, in 
> >  pop3.dele(i+1)
> >File "/usr/lib/python2.7/poplib.py", line 240, in dele
> >  return self._shortcmd('DELE %s' % which)
> >File "/usr/lib/python2.7/poplib.py", line 160, in _shortcmd
> >  return self._getresp()
> >File "/usr/lib/python2.7/poplib.py", line 132, in _getresp
> >  resp, o = self._getline()
> >File "/usr/lib/python2.7/poplib.py", line 377, in _getline
> >  raise error_proto('line too long')
> >  poplib.error_proto: line too long
> >
> >
> > Does anyone have any idea how I can program around this somehow?  As
> > it is at the moment I have to go to the webmail system at my ISP and
> > manually delete the message which is a bit of a nuisance.
> >
> 
> How about a try/except in your code that catches poplib.error_proto?
> 
... and?  I'm still stuck because I can't identify the E-Mail in any
way to enable me to go and find it and delete it.  So the program
keeps trapping on the same E-Mail and never gets to process anything
after that.

-- 
Chris Green
-- 
https://mail.python.org/mailman/listinfo/python-list


EuroPython 2016: Talk voting will start on Monday

2016-03-10 Thread M.-A. Lemburg
Having received almost 300 great proposals for talks, trainings,
helpdesks and posters, we now call out to all attendees to vote for
what you want to see on the conference schedule.

Please note that you have to have a ticket for EuroPython 2016, or
have submitted a talk proposal yourself, in order to participate.


Attendees: This will be your chance to
shape the conference!


You will be able to search for topics and communicate your personal
interest by casting your vote for each talk and training submission on
our talk voting page:


*** https://ep2016.europython.eu/en/talk-voting/ ***

   Talk Voting


Talk voting will be open from Monday, March 14, until Sunday, March 20.

The program workgroup (WG) will then use the talk voting results as
basis for their talk selection and announce the list of accepted talks
late in March and the schedule shortly thereafter in April.


With gravitational regards,
--
EuroPython 2016 Team
http://ep2016.europython.eu/
http://www.europython-society.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread BartC

On 10/03/2016 13:08, Chris Angelico wrote:

> On Thu, Mar 10, 2016 at 11:47 PM, BartC  wrote:
>> You just don't get it.
>
> Both of you "just don't get" something that the other sees as
> critically important. Before this thread gets to fisticuffs, may I
> please summarize the points that probably nobody will concede?
>
> 1) Unicode support, intrinsic to the language, is crucial, even if
> BartC refuses to recognize this. Anything released beyond the confines
> of his personal workspace will need full Unicode support, otherwise it
> is a problem to the rest of the world, and should be destroyed with
> fire. Thanks.


I don't agree. If I distribute some text in the form of a series of 
ASCII byte values (eg. classic TXT format, with either kind of line 
separator), then that same data can be directly interpreted as UTF-8.


(And as you know, the first 128 Unicode code points correspond with the 
128 ASCII codes. Widening such a data-set so that each 8-bit character 
becomes 32-bit will also give you a set of Unicode code-points.)
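
A small Python 3 sketch of the point above, using a made-up sample string:
pure ASCII bytes decode to the same text under both codecs, and the
resulting code points equal the original byte values.

```python
# Sample data is an assumption for illustration: plain ASCII with CRLF.
data = b"Plain ASCII text\r\n"

as_ascii = data.decode("ascii")
as_utf8 = data.decode("utf-8")

# "Widening" each byte to a code point gives the same numbers.
code_points = [ord(ch) for ch in as_utf8]

print(as_ascii == as_utf8)
print(code_points == list(data))
```

This only holds while every byte is below 128; any other byte needs an
agreed encoding.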


Importing a text file from elsewhere is a different problem of course. 
Although out of the thousands of times I must have done this, 
Unicode-related issues have been minimal.



> 3) There is a point at which performance ceases to matter for a given
> application. This point varies from app to app, but generally is
> reached when I/O wait time dwarfs CPU usage.


Also when the total runtime is negligible anyway. It doesn't matter if a 
program takes 200msec instead of 20msec. (Unless millions of such tasks 
are scheduled.)



> When people want better performance out of a number-crunching Python
> program, they have a few options. One is to rewrite their code in C or
> Fortran or something. Another is to make small tweaks so the bulk of
> the work is handled by numpy or Cython. A third is to keep their code
> completely unchanged, but run it under PyPy instead of whatever they
> were previously using (probably CPython). Rewriting in
> C/Fortran is generally a bad idea; you pay the price over the whole
> application, when optimizing a small subset of it would give 99% of
> the performance improvement. That's why actual CPython byte-code
> interpretation performance isn't so critical; if we can change 5% of
> the code so it uses numpy, we keep 95% of it in idiomatic Python,
> while still having the bulk of the work done in Fortran. CPython has
> other priorities than performance - not to say that "slow is fine",
> but more that "slow and dynamic opens up possibilities that fast and
> static precludes, so we're happy to pay the price for the features we
> want".


Generally agree. But also, I often develop an algorithm using a dynamic 
language, because it's much easier and quicker to try out different 
things, before porting the result to a static language.


But during development, it doesn't hurt if the dynamic version isn't 
quite so slow!


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread BartC

On 10/03/2016 02:29, Ben Finney wrote:

> BartC  writes:
>
>> So long as /someone else/ uses the hard language to create the needed
>> libraries, the speed of pure Python is irrelevant. New version of
>> Python is now half the speed? Another shrug!
>
> Citation needed. I don't know of any released version of Python that was
> ever “twice as slow” – no qualifiers – than the previous release.


That was an exaggeration. Yet there was some basis in fact: elsewhere in 
the thread, I gave an example of a two-line loop that took 2.1 times as 
long to execute in Py3.4.3 as on Py2.7.11.


With some intervening versions, but a 2.7.11 user upgrading direct to 
3.4.3 could be in for a shock, if he's into running pointless loops!


--
bartc



--
https://mail.python.org/mailman/listinfo/python-list


Re: Text input with keyboard, via input methods

2016-03-10 Thread Jerry Hill
On Thu, Mar 10, 2016 at 5:51 AM, Marko Rauhamaa  wrote:
> I don't have an answer. I have requirements, though:
>
>  * I should be able to get the character by knowing its glyph (shape).
>
>  * It should be very low-level and work system-wide, preferably over the
>network (I'm typing this over the network).

This probably doesn't meet all your needs, but there are web services
that get you to draw an approximation of a glyph, then present you
with some possible unicode characters to match. I thought there were a
couple of these sites, but on a quick search, I only find one for
unicode glyphs: http://shapecatcher.com/  and another one for TeX
codes: http://detexify.kirelabs.org/classify.html

-- 
Jerry
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Encapsulation in Python

2016-03-10 Thread Dan Strohl via Python-list
> I've been studying Object Oriented Theory using Java. Theoretically, all
> attributes should be private, meaning no one except the methods itself can
> access the attribute;
> 
> public class Foo {
>     private int bar;
>     ...

Why?  I mean sure, lots of them should be, but if I am doing something like:

class person:
 age = 21
 name = 'Cool Dude'

And if I am not doing anything with the information, why make it private... I 
am only going to create a getter/setter that directly accesses it anyway.

Sure, if I think I am going to do something with it later, I will often "hide" 
it using one method or another...

class person():
    _name = ('Cool','Dude')

    def fname(self):
        return self._name[0]

> 
> Normally in Java, we would write getters and setters to set/get the attribute
> bar. However, in Python, we normally create a class like so;
> 
> class Foo(object):
>     bar = 0
>     ...
> 
> And we usually don't write any getters/setters (though they exist in Python, I
> have not seen much projects making use of it).
> 

Lots of projects do use these, but mostly (in my experience) these are 
libraries that are designed to provide easy to use classes / methods to 
developers so that they don’t have to figure things out.  Implementing 
getters/setters is more complex and takes more code (and is more to 
troubleshoot / go wrong).  So, if you don’t need it, why not stick with 
something simple?

> We can easily encapsulate (data hiding) Foo's class using the '_'
> (underscore) when creating a new attribute, however, this would require all
> attributes to have a underscore.

Keep in mind that this doesn’t really hide the data, I can still access it 
(foo._bar = 0), even using the double underscore doesn’t actually "hide" the 
data, it just makes it harder to accidentally override in instances.  The 
underscore is simply a convention that most people choose to use to suggest 
that _this is an internal var and should be used with caution.

> According to this answer [1], it's acceptable to to expose your attribute
> directly (Foo.bar = 0), so I wonder where the encapsulation happens in
> Python? If I can access the attribute whenever I want (with the except of
> using a underscore), what's the best way to encapsulate a class in Python?

Encapsulation can happen if the developer wants it by using any of a number of 
approaches (__getattr__/__setattr__, __getattribute__, __get__/__set__, 
property(), @property, etc...), python allows the developer to define where it 
makes sense to use it, and where not to.

> Why aren't most of the projects not using getters/setters and instead they
> access the variable directly?

I don’t know about "most" projects... but getters/setters (in one form or 
another) are used often in lots of projects... but like I mentioned above, if 
the project doesn’t need to encapsulate the info, why do it?  (keep in mind 
that I can always rewrite the class and add encapsulation later if I need to).

One note here, be careful of getting too caught up in using any single language 
(no matter which one) to define Object Oriented Theory, each approaches it in 
its own way, and each has its own benefits and challenges.  This is a perfectly 
valid question (and a good one), but don’t let yourself get into the trap of 
feeling that Java is the definitive/best/only approach to OO (or Python for 
that matter, or C++, or whatever!).

Dan Strohl

> 
> Regards,
> 
> Ben Mezger
> 
> [1] - http://stackoverflow.com/q/4555932

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text input with keyboard, via input methods

2016-03-10 Thread Marko Rauhamaa
Jerry Hill :

> On Thu, Mar 10, 2016 at 5:51 AM, Marko Rauhamaa  wrote:
>> I don't have an answer. I have requirements, though:
>>
>>  * I should be able to get the character by knowing its glyph (shape).
>>
>>  * It should be very low-level and work system-wide, preferably over the
>>network (I'm typing this over the network).
>
> [...]
> http://shapecatcher.com/

Nice!

Now I need this integrated with the keyboard.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Encapsulation in Python

2016-03-10 Thread Ian Kelly
On Thu, Mar 10, 2016 at 6:41 AM, Ben Mezger  wrote:
> Hi all,
>
> I've been studying Object Oriented Theory using Java. Theoretically, all
> attributes should be private, meaning no one except the methods itself
> can access the attribute;
>
> public class Foo {
>     private int bar;
>     ...

Encapsulation in Python is based on trust rather than the
authoritarian style of C++ / Java. The maxim in the Python community
is that "we're all consenting adults". If I don't intend an attribute
to be messed with, then I'll mark it with a leading underscore. If you
mess with it anyway, and later your code breaks as a result of it,
that's your problem, not mine. :-)

> Normally in Java, we would write getters and setters to set/get the
> attribute bar. However, in Python, we normally create a class like so;
>
> class Foo(object):
>     bar = 0
>     ...
>
> And we usually don't write any getters/setters (though they exist in
> Python, I have not seen much projects making use of it).

The vast majority of getters and setters do nothing other than get/set
the field they belong to. They exist only to allow the *possibility*
of doing something else at some point far in the future. That's a ton
of undesirable boilerplate for little real benefit.

In Python, OO designers are able to get away with not using getters
and setters because we have properties. You can start with an
attribute, and if you later want to change the means of getting and
setting it, you just replace it with a property. The property lets you
add any logic you want, and as far as the outside world is concerned,
it still just looks like an attribute.
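
A minimal sketch of that migration, with a hypothetical Person class and a
made-up validation rule. The class starts life with a plain age attribute;
the property is added later without changing how callers read or write it.

```python
class Person:
    def __init__(self, age):
        # Goes through the setter below, exactly like outside code would.
        self.age = age

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        # Logic added after the fact; callers still write person.age = ...
        if value < 0:
            raise ValueError("age must be non-negative")
        self._age = value


person = Person(21)
person.age = 30      # still looks like a plain attribute assignment
print(person.age)
```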

> We can easily encapsulate (data hiding) Foo's class using the '_'
> (underscore) when creating a new attribute, however, this would require
> all attributes to have a underscore.

A single leading underscore is just a naming convention, not true data
hiding, which isn't really possible in Python. Even the double
underscore only does name mangling, not true data hiding. It's meant
to prevent *accidental* naming collisions. You can still easily access
the attribute from outside the class if you're determined to.
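
A short sketch of what the two underscore forms actually do, using a
throwaway Foo class with made-up attribute names:

```python
class Foo:
    def __init__(self):
        self._bar = 1    # convention only: "internal, use with caution"
        self.__baz = 2   # name-mangled to _Foo__baz inside the class body


foo = Foo()
print(foo._bar)          # nothing stops outside code from reading this
print(foo._Foo__baz)     # the mangled name is reachable too
print(hasattr(foo, "__baz"))  # the unmangled name does not exist
```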

This all boils down to the fact that code inside a method has no
special privilege over external code. If you could hide data so well
that external code really couldn't access it, then you wouldn't be
able to access it either.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread Thomas 'PointedEars' Lahn
Thomas 'PointedEars' Lahn wrote:

[
> key = m.group(1)
> value = int(m.group(2))
> 
> if key not in od:
>   od[key] = value
> else:
>   od[key] += value
> 
> But there is probably an even more pythonic way to do this.
]

For example, based on the original code:

recs = int(input())
od = OrderedDict()
items = []

for _ in range(recs):
    file_input = sys.stdin.readline().strip()
    m = re.search(r"(\w.+)\s+(\d+)", file_input)
    if m: items.append(m.group(1, 2))

od = OrderedDict(map(lambda item: (item[0], 0), items))
for item in items: od[item[0]] += int(item[1])  # group() returns strings
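
For comparison, a self-contained sketch that accumulates directly while
reading, assuming the same "NAME ... NUMBER" line format. total_by_item is
a hypothetical helper name and it takes any iterable of lines rather than
reading stdin.

```python
import re
from collections import OrderedDict


def total_by_item(lines):
    """Sum the trailing number per item name, keeping first-seen order."""
    totals = OrderedDict()
    for line in lines:
        m = re.search(r"(\w.+)\s+(\d+)", line.strip())
        if m:
            key, value = m.group(1), int(m.group(2))
            totals[key] = totals.get(key, 0) + value
    return totals
```

Typical use would be total_by_item(sys.stdin) after consuming the count
line.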

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread Thomas 'PointedEars' Lahn
Thomas 'PointedEars' Lahn wrote:

> od = OrderedDict()

This is pointless, then.

> […]
> od = OrderedDict(map(lambda item: (item[0], 0), items))
> for item in items: od[item[0]] += int(item[1])

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python domain in China. This showed up on Python list

2016-03-10 Thread Uri Even-Chen
I don't register domain names in countries - any domain name which ends
with 2 letters. If you want you can register my name in .cn or any other
country, I don't care.


*Uri Even-Chen*
Phone: +972-54-3995700
Email: u...@speedy.net
Website: http://www.speedysoftware.com/uri/en/
  
    

On Wed, Dec 2, 2015 at 1:42 AM, Steven D'Aprano  wrote:

> On Tue, 1 Dec 2015 10:49 pm, Laura Creighton wrote:
>
> > In a message of Tue, 01 Dec 2015 02:51:21 -0800, Chris Rebert writes:
> >>I hate to break it to you, but this seems to be just another of those
> >>come-ons spammed out by various scummy businesses that trawl WHOIS
> >>databases for people to scam into buying extra/unnecessary domain
> >>names. Google "chinese domain scam" for more info. I've received
> >>similar spams after having registered some .com domains that no
> >>corporation could possibly legitimately want the .cn equivalents of.
> >
> > Ah...  Thank you Chris.  Sure fooled me.
>
>
> You're not the only one. At my day job, we get dozens of these, about one
> or two a month, and the first time it happened, I responded, at which point
> they told us that if we paid $MANY we could register the domain <name>.cn
> before somebody else did.
>
> At that point, we lost interest, as we have no business interests in China.
> If somebody wants to register our name in China, let them.
>
>
>
> --
> Steven
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: looping and searching in numpy array

2016-03-10 Thread Heli
On Thursday, March 10, 2016 at 2:02:57 PM UTC+1, Peter Otten wrote:
> Heli wrote:
> 
> > Dear all,
> > 
> > I need to loop over a numpy array and then do the following search. The
> > following is taking almost 60(s) for an array (npArray1 and npArray2 in
> > the example below) with around 300K values.
> > 
> > 
> > for id in np.nditer(npArray1):
> >   
> >newId=(np.where(npArray2==id))[0][0]
> > 
> > 
> > Is there anyway I can make the above faster? I need to run the script
> > above on much bigger arrays (50M). Please note that my two numpy arrays in
> > the lines above, npArray1 and npArray2  are not necessarily the same size,
> > but they are both 1d.
> 
> You mean you are looking for the index of the first occurrence in npArray2 
> for every value of npArray1?
> 
> I don't know how to do this in numpy (I'm not an expert), but even basic 
> Python might be acceptable:
> 
> lookup = {}
> for i, v in enumerate(npArray2):
>     if v not in lookup:
>         lookup[v] = i
> 
> for v in npArray1:
>     print(lookup.get(v, ""))
> 
> That way you iterate once (in Python) instead of 2*len(npArray1) times (in 
> C) over npArray2.

Dear Peter, 

Thanks for your reply. This really helped. It reduces the script time from 
61(s) to 2(s). 

I am still very interested in knowing the correct numpy way to do this, but 
till then your fix works great. 
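
One possible numpy-only approach, offered as a sketch rather than "the"
correct way: sort the unique values of npArray2 once, then binary-search
every value of npArray1. first_indices and the missing marker of -1 are
assumptions for illustration; both arrays are assumed to be 1-D and
npArray2 non-empty.

```python
import numpy as np


def first_indices(haystack, needles, missing=-1):
    """Index of the first occurrence in haystack for each value in needles."""
    # uniq is sorted; first[i] is where uniq[i] first appears in haystack.
    uniq, first = np.unique(haystack, return_index=True)
    # Binary search for each needle in the sorted unique values.
    pos = np.searchsorted(uniq, needles)
    pos = np.clip(pos, 0, len(uniq) - 1)
    found = uniq[pos] == needles
    return np.where(found, first[pos], missing)


npArray2 = np.array([5, 3, 5, 7, 3])
npArray1 = np.array([3, 7, 9, 5])
print(first_indices(npArray2, npArray1))
```

Values of npArray1 that do not occur in npArray2 come back as the missing
marker instead of raising an error.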

Thanks a lot, 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: looping and searching in numpy array

2016-03-10 Thread Heli
On Thursday, March 10, 2016 at 5:49:07 PM UTC+1, Heli wrote:
> On Thursday, March 10, 2016 at 2:02:57 PM UTC+1, Peter Otten wrote:
> > Heli wrote:
> > 
> > > Dear all,
> > > 
> > > I need to loop over a numpy array and then do the following search. The
> > > following is taking almost 60(s) for an array (npArray1 and npArray2 in
> > > the example below) with around 300K values.
> > > 
> > > 
> > > for id in np.nditer(npArray1):
> > >   
> > >newId=(np.where(npArray2==id))[0][0]
> > > 
> > > 
> > > Is there anyway I can make the above faster? I need to run the script
> > > above on much bigger arrays (50M). Please note that my two numpy arrays in
> > > the lines above, npArray1 and npArray2  are not necessarily the same size,
> > > but they are both 1d.
> > 
> > You mean you are looking for the index of the first occurrence in npArray2 
> > for every value of npArray1?
> > 
> > I don't know how to do this in numpy (I'm not an expert), but even basic 
> > Python might be acceptable:
> > 
> > lookup = {}
> > for i, v in enumerate(npArray2):
> >     if v not in lookup:
> >         lookup[v] = i
> > 
> > for v in npArray1:
> >     print(lookup.get(v, ""))
> > 
> > That way you iterate once (in Python) instead of 2*len(npArray1) times (in 
> > C) over npArray2.
> 
> Dear Peter, 
> 
> Thanks for your reply. This really helped. It reduces the script time from 
> 61(s) to 2(s). 
> 
> I am still very interested in knowing the correct numpy way to do this, but 
> till then your fix works great. 
> 
> Thanks a lot,

And yes, I am looking for the index of the first occurrence in npArray2 
for every value of npArray1.
-- 
https://mail.python.org/mailman/listinfo/python-list


HDF5 data set, unable to read contents

2016-03-10 Thread varun7rs
Hello everyone,

I recently came across a package called matrix2latex which simplifies the 
creation of very large tables. So, I saved my data in .mat format using the 
'-v7.3' extension so that they can be read by python.

Now, when I try to read these files, I'm having some problems in reproducing 
the whole matrix. Whenever I try and print out an element of the array, I only 
get 'HDF5 object reference' as the output. This is basically the short code I 
wrote and what I basically want to do is have the values of the elements in the 
arrays and not the 'HDF5 object reference' thing. ex [ [ 1 2 3 4 ], [2 3 4 5 
]...]

Could you please help me with this? The link to the .mat file is as below

https://drive.google.com/file/d/0B2-j91i19ey2Nm1CbVpCQVdZc3M/view?usp=sharing


import numpy as np, h5py
from matrix2latex import matrix2latex

f = h5py.File('nominal_case_unfix.mat', 'r')
data = f.get('nominal_case_unfix')
np_data = np.array(data)
print np_data



Thank You
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Read and count

2016-03-10 Thread Val Krem via Python-list
Thank you very much for the help.

First I want to count by city and year.
City  year  count
Xc1   2001  1
Xc1   2002  3
Yv1   2001  1
Yv2   2002  4
This worked fine!

Now I want to count by city only
City  Count
Xc1   4
Yv2   5

Then combine these two objects with the original data and send it to a file 
called  "detout" with these columns:

"City", " year ", "x ", "cycount ", "citycount"

Many thanks again
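
A rough Python 3 sketch of the combining step described above, assuming
the rows have already been read as (city, year, x) tuples. annotate and
write_detout are hypothetical names, and the tab-separated layout of
"detout" is an assumption.

```python
import csv
from collections import Counter


def annotate(rows):
    """Attach the per-(city, year) and per-city counts to every row."""
    rows = list(rows)
    by_city_year = Counter((city, year) for city, year, _x in rows)
    by_city = Counter(city for city, _year, _x in rows)
    return [(city, year, x, by_city_year[city, year], by_city[city])
            for city, year, x in rows]


def write_detout(rows, path="detout"):
    """Write the annotated rows with a header line."""
    with open(path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["city", "year", "x", "cycount", "citycount"])
        writer.writerows(annotate(rows))
```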






This worked fine. I tried to count only by city  and combine the three objects 
together 

City
Xc1  4
Yv2  5



Sent from my iPad 

> On Mar 10, 2016, at 3:11 AM, Jussi Piitulainen 
>  wrote:
> 
> Val Krem writes:
> 
>> Hi all,
>> 
>> I am a new learner about python (moving from R to python) and trying
>> read and count the number of observation by year for each city.
>> 
>> 
>> The data set look like
>> city year  x 
>> 
>> XC1 2001  10
>> XC1   2001  20
>> XC1   2002   20
>> XC1   2002   10
>> XC1 2002   10
>> 
>> Yv2 2001   10
>> Yv2 2002   20
>> Yv2 2002   20
>> Yv2 2002   10
>> Yv2 2002   10
>> 
>> out put will be
>> 
>> city
>> xc1  2001  2
>> xc1   2002  3
>> yv1  2001  1
>> yv2  2002  3
>> 
>> 
>> Below is my starting code
>> count=0
>> fo=open("dat", "r+")
>> str = fo.read();
>> print "Read String is : ", str
>> 
>> fo.close()
> 
> Below's some of the basics that you want to study. Also look up the csv
> module in Python's standard library. You will want to learn these things
> even if you end up using some sort of third-party data-frame library (I
> don't know those but they exist).
> 
> from collections import Counter
> 
> # collections.Counter is a special dictionary type for just this
> counts = Counter()
> 
> # with statement ensures closing the file
> with open("dat") as fo:
>     # file object provides lines
>     next(fo) # skip header line
>     for line in fo:
>         # test requires non-empty string, but lines
>         # contain at least newline character so ok
>         if line.isspace(): continue
>         # .split() at whitespace, omits empty fields
>         city, year, x = line.split()
>         # collections.Counter has default 0,
>         # key is a tuple (city, year), parentheses omitted here
>         counts[city, year] += 1
> 
> print("city")
> for city, year in sorted(counts): # iterate over keys
>     print(city.lower(), year, counts[city, year], sep = "\t")
> 
> # Alternatively:
> # for cy, n in sorted(counts.items()):
> #   city, year = cy
> #   print(city.lower(), year, n, sep = "\t")
> -- 
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Review Request of Python Code

2016-03-10 Thread subhabangalore
On Wednesday, March 9, 2016 at 9:49:17 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
> 
> I am trying to write a code for pulling data from MySQL at the backend and 
> annotating words and trying to put the results as separated sentences with 
> each line. The code is generally running fine but I am feeling it may be 
> better in the end of giving out sentences, and for small data sets it is okay 
> but with 50,000 news articles it is performing dead slow. I am using 
> Python2.7.11 on Windows 7 with 8GB RAM. 
> 
> I am trying to copy the code here, for your kind review. 
> 
> import MySQLdb
> import nltk
> def sql_connect_NewTest1():
>     db = MySQLdb.connect(host="localhost",
>                          user="*",
>                          passwd="*",
>                          db="abcd_efgh")
>     cur = db.cursor()
>     #cur.execute("SELECT * FROM newsinput limit 0,5;") #REPORTING RUNTIME ERROR
>     cur.execute("SELECT * FROM newsinput limit 0,50;")
>     dict_open=open("/python27/NewTotalTag.txt","r") #OPENING THE DICTIONARY FILE
>     dict_read=dict_open.read()
>     dict_word=dict_read.split()
>     a4=dict_word #Assignment for code.
>     list1=[]
>     flist1=[]
>     nlist=[]
>     for row in cur.fetchall():
>         #print row[2]
>         var1=row[3]
>         #print var1 #Printing lines
>         #var2=len(var1) # Length of file
>         var3=var1.split(".") #SPLITTING INTO LINES
>         #print var3 #Printing The Lines
>         #list1.append(var1)
>         var4=len(var3) #Number of all lines
>         #print "No",var4
>         for line in var3:
>             #print line
>             #flist1.append(line)
>             linew=line.split()
>             for word in linew:
>                 if word in a4:
>                     windex=a4.index(word)
>                     windex1=windex+1
>                     word1=a4[windex1]
>                     word2=word+"/"+word1
>                     nlist.append(word2)
>                     #print list1
>                     #print nlist
>                 elif word not in a4:
>                     word3=word+"/"+"NA"
>                     nlist.append(word3)
>                     #print list1
>                     #print nlist
>                 else:
>                     print "None"
>
>     #print "###",flist1
>     #print len(flist1)
>     #db.close()
>     #print nlist
>     lol = lambda lst, sz: [lst[i:i+sz] for i in range(0, len(lst), sz)] #TRYING TO SPLIT THE RESULTS AS SENTENCES
>     nlist1=lol(nlist,7)
>     #print nlist1
>     for i in nlist1:
>         string1=" ".join(i)
>         print i
>         #print string1
> 
>
> Thanks in Advance.


Dear Group,

Thank you all, for your kind time and all suggestions in helping me.

Thank you Steve for writing the whole code. It is working full 
and fine. But speed is still an issue. We need to speed up. 

Inada I tried to change to 
cur = db.cursor(MySQLdb.cursors.SSCursor) but my System Admin 
said that may not be an issue.

Freidrich, my problem is I have a big text repository of .txt
files in MySQL in the backend. I have another list of words with
their possible tags. The tags are not conventional Parts of Speech(PoS)
tags,  and bit defined by others. 
The code is expected to read each file and its each line.
On reading each line it will scan the list for appropriate
tag, if it is found it would assign, else would assign NA.
The assignment should be in the format of /tag, so that
if there is a string of n words, it should look like,
w1/tag w2/tag w3/tag w4/tag wn/tag, 

where tag may be tag in the list or NA as per the situation.

This format is taken because the files are expected to be tagged
in Brown Corpus format. There is a Python Library named NLTK.
If I want to save my data for use with their models, I need 
some specifications. I want to use it as Tagged Corpus format. 

Now the tagged data coming out in this format, should be one 
tagged sentences in each new line or a lattice. 

They expect the data to be saved in .pos format but presently 
I am not doing in this code, I may do that later. 

Please let me know if I need to give any more information.

Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
is like a simple list of words with unconventional tags, like,

w1 tag1
w2 tag2
w3 tag3
...
...
w3  tag3

like that. 

Regards,
Subhabrata  

  
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread Terry Reedy

On 3/10/2016 4:02 AM, Rodrick Brown wrote:

> From the following input
>
> 9
> BANANA FRIES 12
> POTATO CHIPS 30
> APPLE JUICE 10
> CANDY 5
> APPLE JUICE 10
> CANDY 5
> CANDY 5
> CANDY 5
> POTATO CHIPS 30
>
> I'm expecting the following output
> BANANA FRIES 12
> POTATO CHIPS 60
> APPLE JUICE 20
> CANDY 20
>
> However my code seems to be returning an incorrect value


Learn to debug.  The incorrect value is the one for candy.  First, 
reduce your input to the candy lines.  Still get wrong answer?  Then 
print the value in od within the loop after each calculation to see when 
and how it goes wrong.
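
A sketch of that debugging step in isolation, assuming only the four candy
lines as input and reusing the update expression from the posted script.
The trace list is an addition for illustration.

```python
from collections import OrderedDict

od = OrderedDict()
trace = []

for line in ["CANDY 5"] * 4:
    key, value = line.rsplit(" ", 1)
    if key not in od:
        od[key] = int(value)
    else:
        # Update as posted: it adds the running total to itself
        # and never looks at the new value.
        od[key] += int(od.get(key, 0))
    trace.append(od[key])
    print(key, od[key])  # watch the value after each calculation
```

Printing inside the loop shows the total doubling on every repeat rather
than growing by 5.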



> #!/usr/bin/env python3
>
> import sys
> import re
> from collections import OrderedDict
>
> if __name__ == '__main__':
>
>   od = OrderedDict()
>   recs = int(input())
>
>   for _ in range(recs):
>     file_input = sys.stdin.readline().strip()
>     m = re.search(r"(\w.+)\s+(\d+)", file_input)
>
>     if m:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))
>   for k,v in od.items():
>     print(k,v)
>
> What's really going on here?
>
> $ cat groceries.txt | ./groceries.py
> BANANA FRIES 12
> POTATO CHIPS 60
> APPLE JUICE 20
> CANDY 40




--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


context managers inline?

2016-03-10 Thread Neal Becker
Is there a way to ensure resource cleanup with a construct such as:

x = load (open ('my file', 'rb'))

Is there a way to ensure this file gets closed?

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Review Request of Python Code

2016-03-10 Thread BartC

On 10/03/2016 18:12, subhabangal...@gmail.com wrote:

> On Wednesday, March 9, 2016 at 9:49:17 AM UTC+5:30, subhaba...@gmail.com wrote:
>
> Thank you Steve for writing the whole code. It is working full
> and fine. But speed is still an issue. We need to speed up.


Which bit is too slow? (Perhaps the print statements in your original 
code will give a clue.)


How many rows, lines and words are we talking about (ie. how many inner 
loops)? How big is the text file? Is the outer function called once and 
that shows the problem, or many times?


It might be that the task is big enough that it actually takes that long.

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: context managers inline?

2016-03-10 Thread sohcahtoa82
On Thursday, March 10, 2016 at 10:33:47 AM UTC-8, Neal Becker wrote:
> Is there a way to ensure resource cleanup with a construct such as:
> 
> x = load (open ('my file', 'rb'))
> 
> Is there a way to ensure this file gets closed?

with open('my file', 'rb') as f:
    x = load(f)

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Review Request of Python Code

2016-03-10 Thread Matt Wheeler
On 10 March 2016 at 18:12,   wrote:
> Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
> is like a simple list of words with unconventional tags, like,
>
> w1 tag1
> w2 tag2
> w3 tag3
> ...
> ...
> w3  tag3
>
> like that.

I suspected so. The way your code currently works, if your input text
contains one of the tags, e.g. 'tag1' you'll get an entry in your
output something like 'tag1/w2'. I assume you don't want that :).

This is because you're using a single list to include all of the tags.
Try something along the lines of:

dict_word={} #empty dictionary
for line in dict_read.splitlines():
    word, tag = line.split(' ')
    dict_word[word] = tag

Notice I'm using splitlines() instead of split() to do the initial
chopping up of your input. split() will split on any whitespace by
default. splitlines should be self-explanatory.

I would split this and the file-open out into a separate function at
this point. Large blobs of sequential code are not particularly easy
on the eyes or the brain -- choose a sensible name, like
load_dictionary. Perhaps something you could call like:

dict_word = load_dictionary("NewTotalTag.txt")


You also aren't closing the file that you open at any point -- once
you've loaded the data from it there's no need to keep the file opened
(look up context managers).
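
For anyone following along, here is a minimal sketch of what such a load_dictionary could look like. The helper tag_words and the use of a bare split() (so that the doubled space in "w3  tag3" does not break the unpacking) are my own assumptions, not something from the original code. A dict lookup takes roughly constant time on average, whereas "word in a4" plus a4.index(word) scan the whole list, so this may also help with the speed problem.

```python
def load_dictionary(path):
    """Read 'word tag' lines from path and return a {word: tag} dict."""
    dict_word = {}
    with open(path) as dict_file:      # file is closed when the block exits
        for line in dict_file:
            parts = line.split()       # splits on any run of whitespace
            if len(parts) == 2:        # skip blank or malformed lines
                word, tag = parts
                dict_word[word] = tag
    return dict_word


def tag_words(words, dict_word):
    """Return 'word/tag' strings, using 'NA' for words without a tag."""
    return [word + "/" + dict_word.get(word, "NA") for word in words]
```

With this, a tag that happens to appear in the input text is treated like any other unknown word and comes out as 'tag1/NA' rather than 'tag1/w2'.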

-- 
Matt Wheeler
http://funkyh.at
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: context managers inline?

2016-03-10 Thread Neal Becker
sohcahto...@gmail.com wrote:

> On Thursday, March 10, 2016 at 10:33:47 AM UTC-8, Neal Becker wrote:
>> Is there a way to ensure resource cleanup with a construct such as:
>> 
>> x = load (open ('my file', 'rb'))
>> 
>> Is there a way to ensure this file gets closed?
> 
> with open('my file', 'rb') as f:
>     x = load(f)

But not in a 1-line, composable manner?

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: context managers inline?

2016-03-10 Thread Ian Kelly
On Thu, Mar 10, 2016 at 11:59 AM, Neal Becker  wrote:
> sohcahto...@gmail.com wrote:
>
>> On Thursday, March 10, 2016 at 10:33:47 AM UTC-8, Neal Becker wrote:
>>> Is there a way to ensure resource cleanup with a construct such as:
>>>
>>> x = load (open ('my file', 'rb'))
>>>
>>> Is there a way to ensure this file gets closed?
>>
>> with open('my file', 'rb') as f:
>>     x = load(f)
>
> But not in a 1-line, composable manner?

def with_(ctx, func):
    with ctx as value:
        return func(value)

x = with_(open('my file', 'rb'), load)


Seems less readable to me, though.
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Review Request of Python Code

2016-03-10 Thread Joaquin Alzola
SQL doesn't allow decimal numbers for LIMIT.
Use decimal numbers it still work but is the proper way.

Then clean up your code a bit and remove the commented-out lines (#).

-Original Message-
From: Python-list 
[mailto:python-list-bounces+joaquin.alzola=lebara@python.org] On Behalf Of 
subhabangal...@gmail.com
Sent: 10 March 2016 18:12
To: python-list@python.org
Subject: Re: Review Request of Python Code

On Wednesday, March 9, 2016 at 9:49:17 AM UTC+5:30, subhaba...@gmail.com wrote:
> Dear Group,
>
> I am trying to write a code for pulling data from MySQL at the backend and 
> annotating words and trying to put the results as separated sentences with 
> each line. The code is generally running fine but I am feeling it may be 
> better in the end of giving out sentences, and for small data sets it is okay 
> but with 50,000 news articles it is performing dead slow. I am using 
> Python2.7.11 on Windows 7 with 8GB RAM.
>
> I am trying to copy the code here, for your kind review.
>
> import MySQLdb
> import nltk
> def sql_connect_NewTest1():
>     db = MySQLdb.connect(host="localhost",
>                          user="*",
>                          passwd="*",
>                          db="abcd_efgh")
>     cur = db.cursor()
>     #cur.execute("SELECT * FROM newsinput limit 0,5;") #REPORTING RUNTIME ERROR
>     cur.execute("SELECT * FROM newsinput limit 0,50;")
>     dict_open=open("/python27/NewTotalTag.txt","r") #OPENING THE DICTIONARY FILE
>     dict_read=dict_open.read()
>     dict_word=dict_read.split()
>     a4=dict_word #Assignment for code.
>     list1=[]
>     flist1=[]
>     nlist=[]
>     for row in cur.fetchall():
>         #print row[2]
>         var1=row[3]
>         #print var1 #Printing lines
>         #var2=len(var1) # Length of file
>         var3=var1.split(".") #SPLITTING INTO LINES
>         #print var3 #Printing The Lines
>         #list1.append(var1)
>         var4=len(var3) #Number of all lines
>         #print "No",var4
>         for line in var3:
>             #print line
>             #flist1.append(line)
>             linew=line.split()
>             for word in linew:
>                 if word in a4:
>                     windex=a4.index(word)
>                     windex1=windex+1
>                     word1=a4[windex1]
>                     word2=word+"/"+word1
>                     nlist.append(word2)
>                     #print list1
>                     #print nlist
>                 elif word not in a4:
>                     word3=word+"/"+"NA"
>                     nlist.append(word3)
>                     #print list1
>                     #print nlist
>                 else:
>                     print "None"
>
>     #print "###",flist1
>     #print len(flist1)
>     #db.close()
>     #print nlist
>     lol = lambda lst, sz: [lst[i:i+sz] for i in range(0, len(lst), sz)] #TRYING TO SPLIT THE RESULTS AS SENTENCES
>     nlist1=lol(nlist,7)
>     #print nlist1
>     for i in nlist1:
>         string1=" ".join(i)
>         print i
>         #print string1
>
>
> Thanks in Advance.


Dear Group,

Thank you all, for your kind time and all suggestions in helping me.

Thank you Steve for writing the whole code. It is working full and fine. But 
speed is still an issue. We need to speed up.

Inada I tried to change to
cur = db.cursor(MySQLdb.cursors.SSCursor) but my System Admin said that may not 
be an issue.

Freidrich, my problem is that I have a big text repository of .txt files in 
MySQL in the backend. I have another list of words with their possible tags. 
The tags are not conventional Parts of Speech (PoS) tags, but are defined by others.
The code is expected to read each file, line by line.
On reading each line it will scan the list for the appropriate tag; if one is 
found it will assign it, else it will assign NA.
The assignment should be in the format word/tag, so that a string of 
n words should look like w1/tag w2/tag w3/tag w4/tag wn/tag,

where tag may be tag in the list or NA as per the situation.

This format is taken because the files are expected to be tagged in Brown 
Corpus format. There is a Python Library named NLTK.
If I want to save my data for use with their models, I need some 
specifications. I want to use it as Tagged Corpus format.

Now the tagged data coming out in this format, should be one tagged sentences 
in each new line or a lattice.

They expect the data to be saved in .pos format but presently I am not doing in 
this code, I may do that later.

Please let me know if I need to give any more information.

Matt, thank you for if...else suggestion, the data of NewTotalTag.txt is like a 
simple list of words with unconventional tags, like,

w1 tag1
w2 tag2
w3 tag3
...
...
w3  tag3

like that.

Regards,
Subhabrata


--
https://mail.python.org/mailman/listinfo/python-list

Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 14:22, BartC wrote:



But during development, it doesn't hurt if the dynamic version isn't
quite so slow!



In [1]: import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

No mention of speed anywhere, but then what does that silly old Tim 
Peters know about anything?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Encapsulation in Python

2016-03-10 Thread Mark Lawrence

On 10/03/2016 14:57, Dan Strohl via Python-list wrote:

I've been studying Object Oriented Theory using Java. Theoretically, all
attributes should be private, meaning no one except the methods itself can
access the attribute;

public class Foo {
 private int bar;
 ...




For the benefit of any newbies/lurkers I'll just point out that this 
might well be valid Java, but...



Why?  I mean sure, lots of them should be, but if I am doing something like:

class person:
  age = 21
  name = 'Cool Dude'



...this gives you class attributes, so the age is always 21 and the name 
is always 'Cool Dude'.  So that you can vary the age and name, you'd need:


class person():
    def __init__(self, age, name):
        self.age = age
        self.name = name
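
A rough sketch of the difference, for the lurkers. The class names Person and PersonWithInit and the sample values are made up for illustration only: a class attribute lives on the class and is seen by every instance that has not set its own value, while attributes assigned in __init__ belong to each object separately.

```python
class Person:
    age = 21                 # class attribute: one value, reached via the class
    name = 'Cool Dude'


class PersonWithInit:
    def __init__(self, age, name):
        self.age = age       # instance attributes: one value per object
        self.name = name


a, b = Person(), Person()
Person.age = 22              # changing the class attribute is seen by both
shared = (a.age, b.age)

c = PersonWithInit(30, 'Alice')
d = PersonWithInit(41, 'Bob')
separate = (c.age, d.age)

print(shared, separate)
```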

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: context managers inline?

2016-03-10 Thread Mark Lawrence

On 10/03/2016 18:33, Neal Becker wrote:

Is there a way to ensure resource cleanup with a construct such as:

x = load (open ('my file', 'rb'))

Is there a way to ensure this file gets closed?



I don't see how there can be.  Surely you must split it into two lines 
to use the context manager via the 'with' keyword, or you leave the one 
line as is and forego the context manager.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Review Request of Python Code

2016-03-10 Thread Mark Lawrence

On 09/03/2016 04:18, subhabangal...@gmail.com wrote:

Dear Group,

I am trying to write a code for pulling data from MySQL at the backend and 
annotating words and trying to put the results as separated sentences with each 
line. The code is generally running fine but I am feeling it may be better in 
the end of giving out sentences, and for small data sets it is okay but with 
50,000 news articles it is performing dead slow. I am using Python2.7.11 on 
Windows 7 with 8GB RAM.

I am trying to copy the code here, for your kind review.

 cur = db.cursor()
 dict_open=open("/python27/NewTotalTag.txt","r") #OPENING THE DICTIONARY 
FILE


As you've had and acknowledged some sound answers, I'll simply point out 
that many people find the first line above, with just that little bit of 
whitespace, far easier to read than the second.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Chris Angelico
On Fri, Mar 11, 2016 at 1:22 AM, BartC  wrote:
>>
>> 1) Unicode support, intrinsic to the language, is crucial, even if
>> BartC refuses to recognize this. Anything released beyond the confines
>> of his personal workspace will need full Unicode support, otherwise it
>> is a problem to the rest of the world, and should be destroyed with
>> fire. Thanks.
>
>
> I don't agree. If I distribute some text in the form of a series of ASCII
> byte values (eg. classic TXT format, with either kind of line separator),
> then that same data can be directly interpreted as UTF-8.

What you call "classic TXT format" is still an encoding, which means
you're acknowledging the difference between characters and bytes -
that's the first step. But you have to be certain that you are
interpreting it as UTF-8, in which case ASCII ceases to be
significant, and what you've done is declare that your file consists
of a stream of UTF-8-encoded Unicode characters, divided into lines
with either U+000D U+000A or just U+000A. That's a nice clear encoding
definition.

And the difference between characters and bytes is only the first step
(albeit the biggest and most important step). You _need_ to make sure
that you're thinking about text as text, and that means being aware of
RTL vs LTR, combining characters, case conversions, collations, etc,
etc, etc, all in terms of Unicode rather than as eight-bit or
seven-bit characters. (For example, a naïve MUD client might assume
that one byte is one character is 8 pixels of width. I know this,
because some years ago I wrote one exactly like that (well, the figure
"8" came from measuring the current font, but other than at font
changes, it was fixed). An intelligent Unicode-aware MUD client has to
not only cope with variable width, but also characters that don't have
any width at all, and those that use the same space as their base
character, and those that are placed to the left of the preceding
character.) You can't ignore this, although you might be able to leave
full support for later - but it's a bug until you do.
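
To make the bytes/characters distinction concrete, here is a small sketch (Python 3; the sample strings are my own, not from the thread). It shows that pure-ASCII bytes decode identically as ASCII or UTF-8, and that a combining character changes both the code point count and the byte count without adding any width on screen.

```python
import unicodedata

# Pure ASCII bytes give the same text whether decoded as ASCII or UTF-8.
ascii_bytes = b"plain text\r\n"
same = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# 'i' followed by U+0308 COMBINING DIAERESIS renders as a single glyph.
decomposed = "nai\u0308ve"
code_points = len(decomposed)                   # counts code points
utf8_bytes = len(decomposed.encode("utf-8"))    # counts encoded bytes

# NFC normalization folds 'i' + U+0308 into the single code point U+00EF.
composed = unicodedata.normalize("NFC", decomposed)

# A non-zero combining class marks a character that attaches to its base.
is_combining = unicodedata.combining("\u0308") != 0

print(same, code_points, utf8_bytes, len(composed), is_combining)
```

So one "character" on screen can be one or two code points and two or three bytes here, which is exactly what a byte-counting client gets wrong.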

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Review Request of Python Code

2016-03-10 Thread subhabangalore
On Friday, March 11, 2016 at 12:22:31 AM UTC+5:30, Matt Wheeler wrote:
> On 10 March 2016 at 18:12,   wrote:
> > Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
> > is like a simple list of words with unconventional tags, like,
> >
> > w1 tag1
> > w2 tag2
> > w3 tag3
> > ...
> > ...
> > w3  tag3
> >
> > like that.
> 
> I suspected so. The way your code currently works, if your input text
> contains one of the tags, e.g. 'tag1' you'll get an entry in your
> output something like 'tag1/w2'. I assume you don't want that :).
> 
> This is because you're using a single list to include all of the tags.
> Try something along the lines of:
> 
> dict_word={} #empty dictionary
> for line in dict_read.splitlines():
>     word, tag = line.split(' ')
>     dict_word[word] = tag
> 
> Notice I'm using splitlines() instead of split() to do the initial
> chopping up of your input. split() will split on any whitespace by
> default. splitlines should be self-explanatory.
> 
> I would split this and the file-open out into a separate function at
> this point. Large blobs of sequential code are not particularly easy
> on the eyes or the brain -- choose a sensible name, like
> load_dictionary. Perhaps something you could call like:
> 
> dict_word = load_dictionary("NewTotalTag.txt")
> 
> 
> You also aren't closing the file that you open at any point -- once
> you've loaded the data from it there's no need to keep the file opened
> (look up context managers).
> 
> -- 
> Matt Wheeler
> http://funkyh.at

Dear Matt,

I want the output in the format w1/tag1. You may find my detailed problem 
statement in my reply to someone else's query. If you like, I can write it out 
again for you.

Regards,
Subhabrata
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: context managers inline?

2016-03-10 Thread Chris Angelico
On Fri, Mar 11, 2016 at 5:33 AM, Neal Becker  wrote:
> Is there a way to ensure resource cleanup with a construct such as:
>
> x = load (open ('my file', 'rb'))
>
> Is there a way to ensure this file gets closed?

Yep!

def read_file(fn, *a, **kw):
    with open(fn, *a, **kw) as f:
        return f.read()

Now you can ensure resource cleanup, because the entire file has been
read in before the function returns. As long as your load() function
is okay with reading from a string, this is effective.

Alternatively, push the closing the other way: pass a file name to
your load() function, and have the context manager in there.
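
A minimal sketch of that second option. The original load() is never identified in the thread, so this assumes it is pickle.load; the wrapper name load_from is hypothetical.

```python
import pickle


def load_from(path):
    """Open path, unpickle one object from it, and close the file."""
    with open(path, 'rb') as f:     # closed as soon as the block exits
        return pickle.load(f)       # assumption: load() is pickle.load
```

The call site then stays a single composable expression, x = load_from('my file'), and the file is closed before the value is returned.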

If you don't do it one of those ways, the question is: WHEN should the
file be closed? How does Python know when it should go and clean that
up? There's no "end of current expression" rule as there is in C++, so
it's safer to use the statement form, which has a definite end (at the
unindent).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text input with keyboard, via input methods

2016-03-10 Thread Chris Angelico
On Fri, Mar 11, 2016 at 2:13 AM, Marko Rauhamaa  wrote:
> Jerry Hill :
>
>> On Thu, Mar 10, 2016 at 5:51 AM, Marko Rauhamaa  wrote:
>>> I don't have an answer. I have requirements, though:
>>>
>>>  * I should be able to get the character by knowing its glyph (shape).
>>>
>>>  * It should be very low-level and work system-wide, preferably over the
>>>network (I'm typing this over the network).
>>
>> [...]
>> http://shapecatcher.com/
>
> Nice!
>
> Now I need this integrated with the keyboard.

Better still, with a Cintiq or something! I know a few artists who
would love playing with that.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Fillmore


when I put a Python script in pipe with other commands, it will refuse 
to let go silently. Any way I can avoid this?


$ python somescript.py | head -5
line 1
line 3
line 3
line 4
line 5
Traceback (most recent call last):
  File "./somescript.py", line 50, in <module>
    sys.stdout.write(row[0])
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe

thanks
--
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Ian Kelly
On Thu, Mar 10, 2016 at 2:33 PM, Fillmore  wrote:
>
> when I put a Python script in pipe with other commands, it will refuse to
> let go silently. Any way I can avoid this?

What is your script doing? I don't see this problem.

ikelly@queso:~ $ cat somescript.py
import sys

for i in range(20):
    sys.stdout.write('line %d\n' % i)
ikelly@queso:~ $ python somescript.py | head -5
line 0
line 1
line 2
line 3
line 4
ikelly@queso:~ $ python3 somescript.py | head -5
line 0
line 1
line 2
line 3
line 4
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text input with keyboard, via input methods

2016-03-10 Thread Ben Bacarisse
Marko Rauhamaa  writes:

> Ben Finney :
>
>> As for how solved it is, that depends on what you're hoping for as a
>> solution.
>>
>> [...]
>>
>> Hopefully your operating system has a good input method system, with
>> many input methods available to choose from. May you find a decent
>> default there.
>
> I don't have an answer. I have requirements, though:
>
>  * I should be able to get the character by knowing its glyph (shape).
>
>  * It should be very low-level and work system-wide, preferably over the
>network (I'm typing this over the network).

I think you are a Gnus user so you probably already know about
insert-char (usually bound to C-x 8 RET though I've re-bound it to my
"insert" key).  Because Emacs's completion facility works with embedded
words you can see Unicode characters with names that include, for
example, "dotless" or "diagonal".  It's not quite "by knowing its glyph"
but it's helped me out many times.


-- 
Ben.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Fillmore

On 3/10/2016 4:46 PM, Ian Kelly wrote:

On Thu, Mar 10, 2016 at 2:33 PM, Fillmore  wrote:


when I put a Python script in pipe with other commands, it will refuse to
let go silently. Any way I can avoid this?


What is your script doing? I don't see this problem.

ikelly@queso:~ $ cat somescript.py
import sys

for i in range(20):
 sys.stdout.write('line %d\n' % i)


you are right. it's the with block :(

import sys
import csv

with open("somefile.tsv", newline='') as csvfile:

    myReader = csv.reader(csvfile, delimiter='\t')
    for row in myReader:

        for i in range(20):
            sys.stdout.write('line %d\n' % i)

$ ./somescript.py | head -5
line 0
line 1
line 2
line 3
line 4
Traceback (most recent call last):
  File "./somescript.py", line 12, in <module>
    sys.stdout.write('line %d\n' % i)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe

--
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Peter Otten
Ian Kelly wrote:

> On Thu, Mar 10, 2016 at 2:33 PM, Fillmore 
> wrote:
>>
>> when I put a Python script in pipe with other commands, it will refuse to
>> let go silently. Any way I can avoid this?
> 
> What is your script doing? I don't see this problem.
> 
> ikelly@queso:~ $ cat somescript.py
> import sys
> 
> for i in range(20):
>     sys.stdout.write('line %d\n' % i)
> ikelly@queso:~ $ python somescript.py | head -5
> line 0
> line 1
> line 2
> line 3
> line 4
> ikelly@queso:~ $ python3 somescript.py | head -5
> line 0
> line 1
> line 2
> line 3
> line 4

I suppose you need to fill the OS-level cache:

$ cat somescript.py 
import sys

for i in range(int(sys.argv[1])):
    sys.stdout.write('line %d\n' % i)
$ python somescript.py 20 | head -n5
line 0
line 1
line 2
line 3
line 4
$ python somescript.py 200 | head -n5
line 0
line 1
line 2
line 3
line 4
$ python somescript.py 2000 | head -n5
line 0
line 1
line 2
line 3
line 4
Traceback (most recent call last):
  File "somescript.py", line 4, in <module>
    sys.stdout.write('line %d\n' % i)
IOError: [Errno 32] Broken pipe

During my experiments I even got

close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

occasionally.


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Ian Kelly
On Thu, Mar 10, 2016 at 3:09 PM, Peter Otten <__pete...@web.de> wrote:
> I suppose you need to fill the OS-level cache:
>
> $ cat somescript.py
> import sys
>
> for i in range(int(sys.argv[1])):
>     sys.stdout.write('line %d\n' % i)
> $ python somescript.py 20 | head -n5
> line 0
> line 1
> line 2
> line 3
> line 4
> $ python somescript.py 200 | head -n5
> line 0
> line 1
> line 2
> line 3
> line 4
> $ python somescript.py 2000 | head -n5
> line 0
> line 1
> line 2
> line 3
> line 4
> Traceback (most recent call last):
>   File "somescript.py", line 4, in <module>
>     sys.stdout.write('line %d\n' % i)
> IOError: [Errno 32] Broken pipe
>
> During my experiments I even got
>
> close failed in file object destructor:
> sys.excepthook is missing
> lost sys.stderr
>
> occasionally.

Interesting, both of these are probably worth bringing up as issues on
the bugs.python.org tracker. I'm not sure that the behavior should be
changed (if we get an error, we shouldn't just swallow it) but it does
seem like a significant hassle for writing command-line
text-processing tools.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text input with keyboard, via input methods

2016-03-10 Thread Marko Rauhamaa
Ben Bacarisse :

> Marko Rauhamaa  writes:
>>  * I should be able to get the character by knowing its glyph
>>(shape).
>>
>>  * It should be very low-level and work system-wide, preferably over
>>the network (I'm typing this over the network).
>
> I think you are a Gnus user so you probably already know about
> insert-char (usually bound to C-x 8 RET though I've re-bound it to my
> "insert" key). Because Emacs's completion facility works with embedded
> words you can see Unicode characters with names that include, for
> example, "dotless" or "diagonal". It's not quite "by knowing its
> glyph" but it's helped me out many times.

At least it works over the network.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread Gregory Ewing

Rodrick Brown wrote:

  if m.group(1) not in od.keys():
    od[m.group(1)] = int(m.group(2))
  else:
    od[m.group(1)] += int(od.get(m.group(1),0))


Others have pointed out what's wrong with this, but here's
a general tip: Don't repeat complicated subexpressions
such as m.group(1). Doing so makes the code hard to read
and therefore hard to spot errors in (and less efficient
as well, although that's a secondary consideration).

Instead, pull them out and give them meaningful names.
Doing so with the above code gives:

  name = m.group(1)
  value = m.group(2)
  if name not in od.keys():
    od[name] = int(value)
  else:
    od[name] += int(od.get(name, 0))

Now it's a lot easier to see that you haven't used the
value anywhere in the second case, which should alert
you that something isn't right.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread Chris Angelico
On Fri, Mar 11, 2016 at 10:24 AM, Gregory Ewing
 wrote:
> Instead, pull them out and give them meaningful names.
> Doing so with the above code gives:
>
>   name = m.group(1)
>   value = m.group(2)
>   if name not in od.keys():
>     od[name] = int(value)
>   else:
>     od[name] += int(od.get(name, 0))
>
> Now it's a lot easier to see that you haven't used the
> value anywhere in the second case, which should alert
> you that something isn't right.

Although in this case, the code is majorly redundant - and could be
replaced entirely with a defaultdict(int).
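
A sketch of the defaultdict(int) version, using the sample input from the original question as a hard-coded list (file reading and the regex are left out; splitting on the last space is my assumption about the line format). Missing keys start at 0, so the if/else disappears.

```python
from collections import defaultdict

lines = [
    "BANANA FRIES 12",
    "POTATO CHIPS 30",
    "APPLE JUICE 10",
    "CANDY 5",
    "APPLE JUICE 10",
    "CANDY 5",
    "CANDY 5",
    "CANDY 5",
    "POTATO CHIPS 30",
]

totals = defaultdict(int)                 # unseen names start at 0
for line in lines:
    name, value = line.rsplit(" ", 1)     # split off the trailing number
    totals[name] += int(value)

for name, total in totals.items():
    print(name, total)
```

On Python 3.7 and later, dicts keep insertion order, so the names print in order of first appearance; on older versions an OrderedDict would be needed for that.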

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Fillmore

On 3/10/2016 5:16 PM, Ian Kelly wrote:


Interesting, both of these are probably worth bringing up as issues on
the bugs.python.org tracker. I'm not sure that the behavior should be
changed (if we get an error, we shouldn't just swallow it) but it does
seem like a significant hassle for writing command-line
text-processing tools.


is it possible that I am the first one encountering this kind of issues?




--
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread INADA Naoki
On Fri, Mar 11, 2016 at 8:48 AM, Fillmore 
wrote:

> On 3/10/2016 5:16 PM, Ian Kelly wrote:
>
>>
>> Interesting, both of these are probably worth bringing up as issues on
>> the bugs.python.org tracker. I'm not sure that the behavior should be
>> changed (if we get an error, we shouldn't just swallow it) but it does
>> seem like a significant hassle for writing command-line
>> text-processing tools.
>>
>
> is it possible that I am the first one encountering this kind of issues?
>
>
No.  I see it usually.

Python's zen says:

> Errors should never pass silently.
>Unless explicitly silenced.

When a write to stdout fails, Python should raise an exception.
You can silence it explicitly when it's safe:

try:
    print(...)
except BrokenPipeError:
    sys.exit(0)
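
For a whole script, a slightly fuller sketch along the lines of the pattern in the signal module documentation ("Note on SIGPIPE") may help. The loop body and the count parameter are placeholders of mine. Catching the exception alone is sometimes not enough, because Python flushes stdout again at exit, which is where the second "Exception ignored" message comes from.

```python
import os
import sys


def main(count=100000):
    try:
        for i in range(count):             # stand-in for the real output loop
            sys.stdout.write('line %d\n' % i)
        sys.stdout.flush()                 # make a pending EPIPE surface here
    except BrokenPipeError:
        # Python flushes stdout again at exit; point it at devnull so that
        # flush cannot raise a second BrokenPipeError during shutdown.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        sys.exit(1)                        # report the broken pipe as a failure
```

Calling main() at the bottom of the script and piping it through head -5 should then end quietly. Whether to exit with 0 or 1 is a matter of taste; the documentation example uses 1.
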
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Simple exercise

2016-03-10 Thread BartC

On 10/03/2016 09:02, Rodrick Brown wrote:

From the following input


9
BANANA FRIES 12
POTATO CHIPS 30
APPLE JUICE 10
CANDY 5
APPLE JUICE 10
CANDY 5
CANDY 5
CANDY 5
POTATO CHIPS 30

I'm expecting the following output
BANANA FRIES 12
POTATO CHIPS 60
APPLE JUICE 20
CANDY 20



Here's a rather un-Pythonic and clunky version. But it gives the 
expected results. (I've dispensed with file input, but that can easily 
be added back.)


def last(a):
    return a[-1]

def init(a):                        # all except last element
    return a[0:len(a)-1]

data = ["BANANA FRIES 12",          # 1+ items/line, last must be numeric
        "POTATO CHIPS 30",
        "APPLE JUICE 10",
        "CANDY 5",
        "APPLE JUICE 10",
        "CANDY 5",
        "CANDY 5",
        "CANDY 5",
        "POTATO CHIPS 30"]

names  = []                         # serve as key/value sets
totals = []

for line in data:                   # banana fries 12
    parts = line.split(" ")         # ['banana','fries','12']
    value = int(last(parts))        # 12
    name  = " ".join(init(parts))   # 'banana fries'

    try:
        n = names.index(name)       # update existing entry
        totals[n] += value
    except ValueError:              # name not seen before
        names.append(name)          # new entry
        totals.append(value)

for i in range(len(names)):
    print (names[i],totals[i])


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


non printable (moving away from Perl)

2016-03-10 Thread Fillmore


Here's another handy Perl regex which I am not sure how to translate to 
Python.


I use it to avoid processing lines that contain funny chars...

if ($string =~ /[^[:print:]]/) {next OUTER;}

:)

--
https://mail.python.org/mailman/listinfo/python-list


Re: non printable (moving away from Perl)

2016-03-10 Thread Ian Kelly
On Mar 10, 2016 5:15 PM, "Fillmore"  wrote:
>
>
> Here's another handy Perl regex which I am not sure how to translate to
Python.
>
> I use it to avoid processing lines that contain funny chars...
>
> if ($string =~ /[^[:print:]]/) {next OUTER;}

Python's re module doesn't support POSIX character classes, but the regex
module on PyPI does.

https://pypi.python.org/pypi/regex
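
If installing a third-party module is not an option, a rough standard-library sketch follows. It assumes that "funny chars" means anything outside the printable ASCII range (space through tilde), which is how [:print:] behaves for plain ASCII data; Perl's behaviour under Unicode or locale settings may differ. The names has_funny_chars and the sample lines are made up for illustration.

```python
import re

# Printable ASCII runs from space (0x20) through tilde (0x7e).
NON_PRINTABLE_ASCII = re.compile(r'[^\x20-\x7e]')


def has_funny_chars(line):
    """True if line contains anything outside printable ASCII."""
    return NON_PRINTABLE_ASCII.search(line) is not None


lines = ["plain text", "tab\there", "caf\u00e9"]

# ASCII-only view: both the tab and the accented letter are rejected.
kept = [line for line in lines if not has_funny_chars(line)]

# Unicode-aware view: str.isprintable() accepts the accented letter.
kept_unicode = [line for line in lines if line.isprintable()]

print(kept, kept_unicode)
```

In a loop, "if has_funny_chars(string): continue" plays the role of the Perl "next".
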
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Other difference with Perl: Python scripts in a pipe

2016-03-10 Thread Fillmore

On 3/10/2016 7:08 PM, INADA Naoki wrote:


No.  I see it usually.

Python's zen says:


Errors should never pass silently.
Unless explicitly silenced.


When a write to stdout fails, Python should raise an exception.
You can silence it explicitly when it's safe:

try:
    print(...)
except BrokenPipeError:
    sys.exit(0)



I don't like it. It makes Python not so good for command-line utilities


--
https://mail.python.org/mailman/listinfo/python-list


Re: Encapsulation in Python

2016-03-10 Thread Rick Johnson
On Thursday, March 10, 2016 at 9:28:06 AM UTC-6, Ian wrote: 
> Encapsulation in Python is based on trust rather than the
> authoritarian style of C++ / Java. The maxim in the Python
> community is that "we're all consenting adults". If I
> don't intend an attribute to be messed with, then I'll
> mark it with a leading underscore. If you mess with it
> anyway, and later your code breaks as a result of it,
> that's your problem, not mine. :-)

It is a strange irony that you cannot escape the
encapsulating nature of Python modules (since they are
formed by Python from your source files), but Python freely
allows us to ignore encapsulation in our OOP paradigm...
Hmm, sure, many could argue that Python's "mandatory
modules" are simply a result of convenience, they will say:
"Since we have to write code in source files anyway, why not
utilize the encapsulation of the source file itself to
define module scope?".

I have witnessed the mayhem that occurs when a language does
not mandate module encapsulation (Ruby, i'm looking directly
at you), and while i agree with the Python designers
that modules must *ALWAYS* be mandatory, i am not convinced
that module space should be so strictly confined to source
files.

Many times, i would have preferred to define my module space
across multiple files, multiple files that could share state
without resorting to the yoga-style "import contortions",
and/or the dreaded "circular import nightmares" that plague
our community today.

In one way, Python got it right by forcing us to encapsulate
our code into modules, however, it failed by not allowing us
to define both the breadth *AND* width of that encapsulation.
Which brings us up against the brutal reality that: Whilst
python's sycophants love to parrot-off about how we are all
"Adults", its implementation still attempts to treat us as
ignorant little children who are incapable of defining our
own module space.

> The vast majority of getters and setters do nothing other
> than get/set the field they belong to. They exist only to
> allow the *possibility* of doing something else at some
> point far in the future.

But you're ignoring the most important aspect of
getters/setters, and that is, that they expose an interface.
An interface that must be *EXPLICITLY* created. Good
interfaces *NEVER* happen by accident.

> That's a ton of undesirable boilerplate for little real
> benefit.

I understand the aversion to boilerplate, but most languages
have simplified the creation of getters/setters to the point
that your lament is unfounded. And i would argue that the
benefits of creating rigid interfaces is the gift that
keeps on giving.

> In Python, OO designers are able to get away with not
> using getters and setters because we have properties. You
> can start with an attribute, and if you later want to
> change the means of getting and setting it, you just
> replace it with a property. The property lets you add any
> logic you want, and as far as the outside world is
> concerned, it still just looks like an attribute.
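
For readers who have not met properties, a minimal sketch of the mechanism described in the quote above. The Account class and its validation rule are hypothetical, chosen only to show that callers keep using plain attribute syntax after logic has been added.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance          # goes through the property setter

    @property
    def balance(self):
        """Read access: looks like a plain attribute to callers."""
        return self._balance

    @balance.setter
    def balance(self, value):
        # Logic added later without changing how callers use the attribute.
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


acct = Account(10)
acct.balance = 25                       # same syntax as a plain attribute
print(acct.balance)
```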

I used properties quite often when i first began writing
Python code. Heck, i thought they were the best thing since
sliced bread. But now, i have come to hate them. I never use
them in any new code, and when i have free time, i strip
them out of old code. When i am forced to use an interface
that was written with properties, i find that learning the
interface is more difficult:

  (1) In a dir listing, one cannot determine which symbols
  are properties, which are attributes, and which are
  methods. The beauty of pure OOP encapsulation, is that,
  *EVERY* exposed symbol is a method. In pure OOP, i don't
  have to wonder if i'm calling a function with no
  parameters, or accessing a property, or accessing an
  attribute, no, i'll know that every access is through a
  method, therefore, i will append "()" when i know the
  method takes no arguments. Consistency is very important.
  
  (2) Properties and attributes encourage vague naming
  schemes. When i read code, i find the code more
  comprehensible when the symbols give me clues as to what
  is going on. So if i read code like: `re.groups`,
  befuddlement sets in. What is "groups"? A function
  object? An attribute? "getCapturingGroupCount" would be a
  better name (but that's semantics). In pure OOP, methods
  are the only communication medium, so we're more likely to
  write "getBlah" and "setBlah", and use verbs for
  procedural names -- these naming conventions are more
  comprehensible to the user.

  (3) Not all authors correctly utilize leading underscores
  to differentiate between public and private attributes.
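
For readers following along, here is a minimal sketch of the pattern being argued over. The class `Account` and the helper `classify` are hypothetical names invented for illustration, not anything from the thread. It shows an attribute that was later turned into a property without changing how callers use it, and one possible way to tell properties, methods and plain attributes apart, which a bare dir() listing does not do.

```python
class Account:
    """Hypothetical class: 'balance' began life as a plain attribute."""

    def __init__(self, owner, balance):
        self.owner = owner          # still a plain attribute
        self.balance = balance      # now routed through the property setter

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

    def describe(self):
        return "%s: %s" % (self.owner, self.balance)


def classify(obj):
    """Map each public name on obj to 'property', 'method' or 'attribute'."""
    kinds = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue
        # Properties live on the class, so look there first.
        if isinstance(getattr(type(obj), name, None), property):
            kinds[name] = "property"
        elif callable(getattr(obj, name)):
            kinds[name] = "method"
        else:
            kinds[name] = "attribute"
    return kinds


acct = Account("rick", 10)
acct.balance = 25                   # same syntax as a plain attribute
for name, kind in sorted(classify(acct).items()):
    print(name, kind)
```

Whether the extra introspection step is an acceptable price for the shorter call syntax is exactly the point of disagreement above; the sketch only shows that the information is recoverable.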
  
> This all boils down to the fact that code inside a method
> has no special privilege over external code. If you could
> hide data so well that external code really couldn't
> access it, then you wouldn't be able to access it either.

That's a ridiculous statement, Ian.



Re: A mistake which almost went me mad

2016-03-10 Thread Rick Johnson
On Thursday, March 10, 2016 at 12:13:39 AM UTC-6, Rustom Mody wrote: 
> As usual Rick I find myself agreeing with your direction [also it seems
> Random832's direction]

Somehow i missed Random's remark... Hmm, he does have a good idea! Introducing 
a new "import statement" would not break anything in the way that rearranging 
the stdlib structure would. I think it's a great idea, and *LONG* overdue.


Re: Simple exercise

2016-03-10 Thread Mark Lawrence

On 11/03/2016 00:05, BartC wrote:
> On 10/03/2016 09:02, Rodrick Brown wrote:
>> From the following input
>>
>> 9
>> BANANA FRIES 12
>> POTATO CHIPS 30
>> APPLE JUICE 10
>> CANDY 5
>> APPLE JUICE 10
>> CANDY 5
>> CANDY 5
>> CANDY 5
>> POTATO CHIPS 30
>>
>> I'm expecting the following output
>> BANANA FRIES 12
>> POTATO CHIPS 60
>> APPLE JUICE 20
>> CANDY 20
>
> Here's a rather un-Pythonic and clunky version. But it gives the
> expected results. (I've dispensed with file input, but that can easily
> be added back.)
>
> def last(a):
>     return a[-1]
>
> def init(a): # all except last element
>     return a[0:len(a)-1]

What is wrong with a[0:1] ?

> data =["BANANA FRIES 12",# 1+ items/line, last must be numeric
>        "POTATO CHIPS 30",
>        "APPLE JUICE 10",
>        "CANDY 5",
>        "APPLE JUICE 10",
>        "CANDY 5",
>        "CANDY 5",
>        "CANDY 5",
>        "POTATO CHIPS 30"]
>
> names  = []# serve as key/value sets
> totals = []
>
> for line in data:  # banana fries 12
>     parts = line.split(" ")# ['banana','fries','12']
>     value = int(last(parts))   # 12
>     name  =  " ".join(init(parts)) # 'banana fries'
>
>     try:
>         n = names.index(name)  # update existing entry
>         totals[n] += value
>     except:

Never use a bare except.  Better still, use an appropriate collection 
rather than two lists.  Off of the top of my head a counter or a 
defaultdict.

>         names.append(name) # new entry
>         totals.append(value)
>
> for i in range(len(names)):
>     print (names[i],totals[i])

Always a code smell when range() and len() are combined.
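
A minimal sketch of the defaultdict suggestion, using the sample data from the original question. It assumes Python 3.7 or later, where dicts (and so defaultdict) keep insertion order; on older versions collections.OrderedDict would be needed to get the output in first-seen order.

```python
from collections import defaultdict

data = ["BANANA FRIES 12",
        "POTATO CHIPS 30",
        "APPLE JUICE 10",
        "CANDY 5",
        "APPLE JUICE 10",
        "CANDY 5",
        "CANDY 5",
        "CANDY 5",
        "POTATO CHIPS 30"]

totals = defaultdict(int)              # unseen names start at 0
for line in data:
    name, value = line.rsplit(" ", 1)  # split off the trailing number
    totals[name] += int(value)

for name, total in totals.items():     # no range(len(...)) needed
    print(name, total)
```

One collection replaces the two parallel lists, and the try/except disappears because a missing key is no longer an error.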

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: non printable (moving away from Perl)

2016-03-10 Thread Mark Lawrence

On 11/03/2016 00:25, Ian Kelly wrote:
> On Mar 10, 2016 5:15 PM, "Fillmore" wrote:
>> Here's another handy Perl regex which I am not sure how to translate to
>> Python.
>>
>> I use it to avoid processing lines that contain funny chars...
>>
>> if ($string =~ /[^[:print:]]/) {next OUTER;}
>
> Python's re module doesn't support POSIX character classes, but the regex
> module on PyPI does.
>
> https://pypi.python.org/pypi/regex

There are plenty of testers for the re module, but do you know if there 
are any available for the above, as it's not the easiest thing to search 
for?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Simple exercise

2016-03-10 Thread BartC

On 11/03/2016 01:21, Mark Lawrence wrote:
> On 11/03/2016 00:05, BartC wrote:
>> def last(a):
>>     return a[-1]
>>
>> def init(a): # all except last element
>>     return a[0:len(a)-1]
>
> What is wrong with a[0:1] ?

That returns the head of the list. I need everything except the last 
element ('init' is from Haskell).

>> for i in range(len(names)):
>>     print (names[i],totals[i])
>
> Always a code smell when range() and len() are combined.

Any other way of traversing two lists in parallel?

--
Bartc




Re: Simple exercise

2016-03-10 Thread Larry Martell
On Thu, Mar 10, 2016 at 8:45 PM, BartC  wrote:
> Any other way of traversing two lists in parallel?

zip
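
Applied to the two lists from the exercise, that might look like the sketch below. The list contents are filled in here from the expected output purely for illustration.

```python
names = ["BANANA FRIES", "POTATO CHIPS", "APPLE JUICE", "CANDY"]
totals = [12, 60, 20, 20]

# zip pairs the items up position by position; no index variable needed.
for name, total in zip(names, totals):
    print(name, total)

pairs = list(zip(names, totals))
```

Note that zip stops at the end of the shorter input, so lists of unequal length are truncated silently rather than raising an error.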


Re: Simple exercise

2016-03-10 Thread Martin A. Brown

>>> for i in range(len(names)):
>>>     print (names[i],totals[i])
>>
>> Always a code smell when range() and len() are combined.
>
> Any other way of traversing two lists in parallel?

Yes.  Builtin function called 'zip'.

  https://docs.python.org/3/library/functions.html#zip

Toy example:

  import string
  alpha = string.ascii_lowercase
  nums = range(len(alpha))
  for N, A in zip(nums, alpha):
      print(N, A)

Good luck,

-Martin

-- 
Martin A. Brown
http://linux-ip.net/


Re: Simple exercise

2016-03-10 Thread Mark Lawrence

On 11/03/2016 01:45, BartC wrote:
> On 11/03/2016 01:21, Mark Lawrence wrote:
>> On 11/03/2016 00:05, BartC wrote:
>>> def last(a):
>>>     return a[-1]
>>>
>>> def init(a): # all except last element
>>>     return a[0:len(a)-1]
>>
>> What is wrong with a[0:1] ?
>
> That returns the head of the list. I need everything except the last
> element ('init' is from Haskell).

I missed out one character, it should of course have been:-

a[0:-1]

>>> for i in range(len(names)):
>>>     print (names[i],totals[i])
>>
>> Always a code smell when range() and len() are combined.
>
> Any other way of traversing two lists in parallel?

Use zip(), but as I suggested in my earlier reply there are better data 
structures than two lists in parallel for this problem.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Simple exercise

2016-03-10 Thread Mark Lawrence

On 11/03/2016 01:56, Martin A. Brown wrote:
>>>> for i in range(len(names)):
>>>>     print (names[i],totals[i])
>>>
>>> Always a code smell when range() and len() are combined.
>>
>> Any other way of traversing two lists in parallel?
>
> Yes.  Builtin function called 'zip'.
>
>    https://docs.python.org/3/library/functions.html#zip
>
> Toy example:
>
>    import string
>    alpha = string.ascii_lowercase
>    nums = range(len(alpha))
>    for N, A in zip(nums, alpha):
>        print(N, A)
>
> Good luck,
>
> -Martin



Which would usually be written for N, A in enumerate(alpha):
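
Spelled out in full, the toy example might be rewritten along these lines (a sketch; the lower-case variable names are a choice made here, not from the thread):

```python
import string

alpha = string.ascii_lowercase

# enumerate yields (index, item) pairs, so the separate
# range(len(alpha)) sequence is no longer needed.
for n, a in enumerate(alpha):
    print(n, a)

first_three = list(enumerate(alpha))[:3]
```

enumerate also accepts a start argument, e.g. enumerate(alpha, start=1), for counting from something other than zero.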

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Simple exercise

2016-03-10 Thread Chris Kaynor
On Thu, Mar 10, 2016 at 4:05 PM, BartC  wrote:

> Here's a rather un-Pythonic and clunky version. But it gives the expected
> results. (I've dispensed with file input, but that can easily be added
> back.)
>
> def last(a):
>     return a[-1]
>
> def init(a): # all except last element
>     return a[0:len(a)-1]
>
> data =["BANANA FRIES 12",# 1+ items/line, last must be numeric
>"POTATO CHIPS 30",
>"APPLE JUICE 10",
>"CANDY 5",
>"APPLE JUICE 10",
>"CANDY 5",
>"CANDY 5",
>"CANDY 5",
>"POTATO CHIPS 30"]
>
> names  = []# serve as key/value sets
> totals = []
>
> for line in data:  # banana fries 12
>     parts = line.split(" ")# ['banana','fries','12']
>     value = int(last(parts))   # 12
>     name  =  " ".join(init(parts)) # 'banana fries'
>

This could be written as (untested):

name, value = line.rsplit(' ', 1) # line.rsplit(maxsplit=1) should also work
value = int(value)


No need to rejoin the string this way.

See also: https://docs.python.org/3.5/library/stdtypes.html#str.rsplit
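
A small sketch of how the two rsplit spellings behave, wrapped in a helper called `parse` (a hypothetical name used only here). The two forms differ when a line contains runs of spaces:

```python
def parse(line):
    """Split 'NAME WORDS 12' into ('NAME WORDS', 12)."""
    name, value = line.rsplit(" ", 1)   # split once, from the right
    return name, int(value)


print(parse("BANANA FRIES 12"))
print(parse("CANDY 5"))

# With an explicit ' ' separator, extra spaces stay attached to the name;
# with no separator, any run of whitespace counts as one split point.
explicit = "CANDY  5".rsplit(" ", 1)
whitespace = "CANDY  5".rsplit(maxsplit=1)
```

A line with no space at all yields a single-element list, so the tuple unpacking in parse raises ValueError; real input handling would need to decide what to do in that case.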


>     try:
>         n = names.index(name)  # update existing entry
>         totals[n] += value
>     except:
>         names.append(name) # new entry
>         totals.append(value)
>
> for i in range(len(names)):
>     print (names[i],totals[i])
>

Chris


Re: Simple exercise

2016-03-10 Thread BartC

On 11/03/2016 02:03, Mark Lawrence wrote:
> On 11/03/2016 01:45, BartC wrote:
>> On 11/03/2016 01:21, Mark Lawrence wrote:
>>> On 11/03/2016 00:05, BartC wrote:
>>>> def last(a):
>>>>     return a[-1]
>>>>
>>>> def init(a): # all except last element
>>>>     return a[0:len(a)-1]
>>>
>>> What is wrong with a[0:1] ?
>>
>> That returns the head of the list. I need everything except the last
>> element ('init' is from Haskell).
>
> I missed out one character, it should of course have been:-
>
> a[0:-1]


I tried that, but I must have got something wrong.
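
For reference, a quick sketch comparing the slices discussed in this sub-thread on a sample list (the variable names are made up for the comparison):

```python
parts = ["BANANA", "FRIES", "12"]

head = parts[0:1]                     # first element only, as a list
init_long = parts[0:len(parts) - 1]   # the original init() body
init_neg = parts[0:-1]                # negative index counts from the end
init_short = parts[:-1]               # the start index can be omitted

print(head)
print(init_long, init_neg, init_short)
```

The last three slices are equivalent for any list, including the empty list, where each simply gives back an empty list.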

--
Bartc


Re: non printable (moving away from Perl)

2016-03-10 Thread Ian Kelly
On Mar 10, 2016 6:33 PM, "Mark Lawrence"  wrote:
>
> On 11/03/2016 00:25, Ian Kelly wrote:
>>
>> On Mar 10, 2016 5:15 PM, "Fillmore"  wrote:
>>>
>>>
>>>
>>> Here's another handy Perl regex which I am not sure how to translate to
>>> Python.
>>>
>>>
>>> I use it to avoid processing lines that contain funny chars...
>>>
>>> if ($string =~ /[^[:print:]]/) {next OUTER;}
>>
>>
>> Python's re module doesn't support POSIX character classes, but the regex
>> module on PyPI does.
>>
>> https://pypi.python.org/pypi/regex
>>
>
> There are plenty of testers for the re module, but do you know if there
> are any available for the above, as it's not the easiest thing to search
> for?

No idea.
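
For the Perl test quoted above, a rough standard-library sketch is possible without the third-party regex module. It assumes that "printable" means printable ASCII (space through tilde), which may or may not match what Perl's [:print:] does for a given string and locale. The helper name `has_non_printable` is made up for this example.

```python
import re

# Anything outside space (0x20) .. tilde (0x7E) counts as non-printable.
NON_PRINTABLE = re.compile(r"[^\x20-\x7e]")


def has_non_printable(line):
    return NON_PRINTABLE.search(line) is not None


lines = ["plain text", "tab\there", "bell\x07", "caf\u00e9"]

# Equivalent of Perl's "next OUTER": skip the offending lines.
kept = [line for line in lines if not has_non_printable(line)]
print(kept)

# A Unicode-aware alternative: str.isprintable() accepts accented
# letters but still rejects tabs and other control characters.
unicode_kept = [line for line in lines if line.isprintable()]
print(unicode_kept)
```

Which of the two behaviours is wanted depends on whether accented characters count as "funny chars" in the data at hand.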


Re: context managers inline?

2016-03-10 Thread Steven D'Aprano
On Fri, 11 Mar 2016 05:33 am, Neal Becker wrote:

> Is there a way to ensure resource cleanup with a construct such as:
> 
> x = load (open ('my file', 'rb'))
> 
> Is there a way to ensure this file gets closed?

Depends on what you mean by "ensure". Having load() call the file's close
method may be good enough.

If you want a better guarantee, you need either a with block:

with open(...) as f:
    ...


or a finally block:

try:
    ...
finally:
    ...


There is no expression-based version of these.
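
One common workaround is to hide the with block inside a small helper, so that the call site is a single expression again. The sketch below assumes that load() in the question means pickle.load, which the question does not actually say; `load_from` is a hypothetical name.

```python
import os
import pickle
import tempfile


def load_from(path):
    """Open path, unpickle one object, and close the file either way."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip through a temporary file to show the call site.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        pickle.dump({"answer": 42}, f)
    x = load_from(path)        # one expression; the file is closed on return
    print(x)
finally:
    os.remove(path)
```

The with statement has not gone away, it has only moved, but the file is closed even if pickle.load raises.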



-- 
Steven



Re: Text input with keyboard, via input methods

2016-03-10 Thread Larry Hudson via Python-list

On 03/09/2016 11:54 PM, Rustom Mody wrote:
[...]
> In between these two extremes we have many possibilities
> - ibus/gchar etc
> - compose key
> - alternate keyboard layouts
>
> Using all these levels judiciously seems to me a good idea...


FWIW -- in Mint Linux you can select the compose key with the following:

Preferences->Keyboard->Layouts->Options->Position of Compose Key

then check the box(es) you want.

For those unfamiliar with it, to use it you press your selected compose key followed by a 
sequence of characters (usually 2 but sometimes more).


A couple examples--
n~ gives ñ    u" gives ü    oo gives °  (degree sign)
Here's a cute one:  CCCP gives ☭  (hammer & sickle)

This gives you (relatively) easy access to a large range of 'special' 
characters.



Re: Pyhon 2.x or 3.x, which is faster?

2016-03-10 Thread Steven D'Aprano
On Fri, 11 Mar 2016 07:07 am, Chris Angelico wrote:

> You _need_ [emphasis in original] to make sure
> that you're thinking about text as text, and that means being aware of
> RTL vs LTR, combining characters, case conversions, collations, etc,
> etc, etc, all in terms of Unicode rather than as eight-bit or
> seven-bit characters.

And I thought that I was a Unicode-evangelist...

You don't "need" to do anything of the sort, any more than (say) Firefox
needs to support displaying Scitex CT image files.

If you want to put people off Unicode and make them even more resistant, the
idea that there is no middle ground between "naive ASCII" and full,
complete, total and utterly 100% coverage of the entire Unicode standard
will do it nicely. Unicode covers a huge amount of ground, and most users
won't need more than a fraction of it. Especially people like Bart, who is
writing code for his own personal use.

If Bart gets to the point of being able to correctly read and write his
mostly ASCII text as UTF-8 files without mojibake, that's probably more
than he'll personally ever need. Or not. Only he will tell.
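
As a rough illustration of that "middle ground", the sketch below writes mostly-ASCII text as UTF-8 and reads it back, then reads the same bytes with the wrong codec to show where mojibake comes from. The sample text and the temporary file are made up for the example.

```python
import os
import tempfile

text = "mostly ASCII, plus caf\u00e9 and 20\u00b0"

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Naming the encoding explicitly avoids depending on the
    # platform default, which differs between systems.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    with open(path, encoding="utf-8") as f:
        round_trip = f.read()          # same codec: text survives intact

    with open(path, encoding="latin-1") as f:
        garbled = f.read()             # wrong codec: mojibake
finally:
    os.remove(path)

print(round_trip == text)
print(garbled)
```

Passing encoding="utf-8" on both the write and the read is the whole trick; none of the heavier Unicode machinery (collation, bidirectional text, normalisation) is involved.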

> An intelligent Unicode-aware MUD client has to 
> not only cope with variable width 

That may be true, but that doesn't mean that there isn't still room in the
world for dumb, just-barely Unicode capable clients. And frankly I would
rather have partial Unicode support than buggy Unicode support: I have a text
editor which would be my editor of choice except it has an
annoying bug where it will (seemingly at random) switch to Right-To-Left
mode for no reason, and then be impossible to switch back. Since I have
*no* use for RTL, I would rather have an editor that doesn't support that than
one that supports it buggily.



-- 
Steven


