Looking for advice

2018-04-20 Thread 20/20 Lab
Going to write my first python program that uses a database. Going to 
store 50-100 rows with 5-10 columns.  Which database / module would you 
advise me to use?  It's basically going to be processing order status 
emails for the sales staff.  Producing a webpage (2-3 times daily, as 
updates arrive) that has the sales staff orders and status on it.   I'm 
thinking just a simple SQLite database, but I don't want to waste time going down 
the wrong path.
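
For a table of this size (50-100 rows, 5-10 columns), the standard library's sqlite3 module is probably enough, and it needs no server or extra install. Below is a minimal sketch of storing and updating order status. The table name `orders`, its columns, the helper `record_status`, and the sample values are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")  # pass a file path such as "orders.db" to persist
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS orders (
        order_id   TEXT PRIMARY KEY,
        sales_rep  TEXT NOT NULL,
        customer   TEXT,
        status     TEXT NOT NULL,
        updated_at TEXT
    )
    """
)


def record_status(conn, order_id, sales_rep, customer, status, updated_at):
    """Insert a new order, or overwrite the stored row for an existing order_id."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO orders VALUES (?, ?, ?, ?, ?)",
            (order_id, sales_rep, customer, status, updated_at),
        )


# Two status emails for the same order, then one for a different order.
record_status(conn, "A100", "Pat", "Acme", "ordered", "2018-04-20 09:00")
record_status(conn, "A100", "Pat", "Acme", "shipped", "2018-04-20 14:30")
record_status(conn, "A101", "Sam", "Globex", "ordered", "2018-04-20 10:15")

rows = conn.execute(
    "SELECT sales_rep, order_id, status FROM orders ORDER BY sales_rep, order_id"
).fetchall()
for sales_rep, order_id, status in rows:
    print(sales_rep, order_id, status)
```

The `rows` list is what a page generator would loop over to build the status table for the sales staff.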



Thank you for your time

--
https://mail.python.org/mailman/listinfo/python-list


Re: Easier way to do this?

2017-10-05 Thread 20/20 Lab



On 10/04/2017 05:11 PM, Irv Kalb wrote:

I'm assuming from your posts that you are not a student.  If that is the case, 
look at my solution below.


On Oct 4, 2017, at 9:42 AM, 20/20 Lab  wrote:

Looking for advice for what looks to me like clumsy code.

I have a large csv (effectively garbage) dump.  I have to pull out sales 
information per employee and count them by price range. I've got my code 
working, but I'm thinking there must be a more refined way of doing this.

---snippet of what I have---

EMP1 = [0,0]
EMP2 = [0,0]
EMP3 = [0,0]

for line in (inputfile):
    content = line.split(",")
    if content[18] == "EMP1":
        if float(content[24]) < 99.75:
            EMP1[0] += 1
        elif float(content[24]) > 99.74:
            EMP1[1] += 1
    if content[18] == "EMP2":
        if float(content[24]) < 99.75:
            EMP2[0] += 1
        elif float(content[24]) > 99.74:
            EMP2[1] += 1
    if content[18] == "EMP3":
        if float(content[24]) < 99.75:
            EMP3[0] += 1
        elif float(content[24]) > 99.74:
            EMP3[1] += 1

and repeat if statements for the rest of 25+ employees.  I can make a list of the 
employees, but I'd prefer to pull them from the csv, as our turnover is rather high 
(however this is not important).  I'm thinking another "for employee in 
content[18]" should be there, but when I tried, my numbers were incorrect.

Any help / advice is appreciated,

Matt



You could certainly use the csv module if you want, but this builds on your 
start of dealing with the data line by line.

Completely untested, but this approach works by building a dictionary on the 
fly from your data.  Each key is an employee name.  The data associated with 
each key is a two item list of counts.


# Constants
NAME_INDEX = 18
SALES_INDEX = 24
THRESHOLD = 99.75

salesCountDict = {}  # start with an empty dict

for line in (inputfile):
    content = line.split(",")  # split the line into a list
    name = content[NAME_INDEX]  # extract the name from the content list

    # If we have not seen this employee name before, add it to the dictionary
    # as a key/value pair like: '<name>': [0, 0]
    if name not in salesCountDict:
        salesCountDict[name] = [0, 0]

    price = float(content[SALES_INDEX])  # extract the price

    # If the price is under the threshold, increment one value in the
    # associated sales list; otherwise increment the other
    if price < THRESHOLD:
        salesCountDict[name][0] += 1
    else:
        salesCountDict[name][1] += 1
 


# Now you should have a dictionary.  Do what you want with it.  For example:

for name in salesCountDict:
    salesList = salesCountDict[name]
    print(name, salesList)  # Assuming Python 3
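
The same dictionary-on-the-fly idea can be combined with the csv module and collections.defaultdict, which creates the `[0, 0]` entry the first time a name is seen. This is only a sketch: the column positions 18 and 24 and the 99.75 cut-off come from the original snippet, while the function name `count_sales`, the helper `make_row`, and the sample rows are made up for illustration. Skipping rows that are too short or non-numeric is an assumption about how to treat the "garbage" lines.

```python
import csv
import io
from collections import defaultdict

NAME_INDEX = 18    # employee column, from the original snippet
SALES_INDEX = 24   # price column, from the original snippet
THRESHOLD = 99.75


def count_sales(lines):
    """Return {employee: [count_below_threshold, count_at_or_above_threshold]}."""
    counts = defaultdict(lambda: [0, 0])
    for row in csv.reader(lines):
        if len(row) <= SALES_INDEX:
            continue  # assumption: ignore rows that are too short
        try:
            price = float(row[SALES_INDEX])
        except ValueError:
            continue  # assumption: ignore headers and non-numeric prices
        counts[row[NAME_INDEX]][0 if price < THRESHOLD else 1] += 1
    return dict(counts)


def make_row(name, price):
    """Build a fake 25-column CSV line with only the two columns of interest set."""
    row = [""] * 25
    row[NAME_INDEX] = name
    row[SALES_INDEX] = str(price)
    return ",".join(row)


sample = io.StringIO("\n".join([
    make_row("EMP1", 50.00),
    make_row("EMP1", 120.00),
    make_row("EMP2", 99.75),
    make_row("EMP1", "n/a"),   # skipped: price is not a number
]))
result = count_sales(sample)
print(result)
```

With a real export, `count_sales(open("dump.csv", newline=""))` would replace the in-memory sample; the file name here is a placeholder.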

 
 



Thanks for this.  I've recently had a hard time discerning when to use, and 
the differences between, dict, list, set, and tuple.  So this is a huge help.



Re: Easier way to do this?

2017-10-05 Thread 20/20 Lab



On 10/04/2017 04:48 PM, Dennis Lee Bieber wrote:

On Wed, 4 Oct 2017 09:42:18 -0700, 20/20 Lab  declaimed
the following:

Well -- since your later post implies this is not some "homework
assignment"...


Looking for advice for what looks to me like clumsy code.

EMP1 = [0,0]
EMP2 = [0,0]
EMP3 = [0,0]


EEEK! Don't do that! Especially as...


for line in (inputfile):
     content = line.split(",")

You've already been told to use the CSV module, since it should handle
tricky cases (quoted strings with embedded commas, say).


     if content[18] == "EMP1":

... the name is part of the input data. Use a data structure in which the
name is part of the data -- like a dictionary.


         if float(content[24]) < 99.75:
             EMP1[0] += 1
         elif float(content[24]) > 99.74:
             EMP1[1] += 1

Pardon? Floating point numbers are not exact... It is possible that
some entry, in floating binary, is really between 99.75 and 99.74, so which
should it be counted as? At the least, just use an else: for the other
case.



employees = {}
for row in csvfile:  # pseudo-code for however you read each row of data
    emp = employees.get(row[18], [0, 0])
    if float(row[24]) < 99.75:
        emp[0] += 1
    else:
        emp[1] += 1
    employees[row[18]] = emp



and repeat if statements for the rest of 25+ employees.  I can make a
list of the employees, but I'd prefer to pull them from the csv, as our
turnover is rather high (however this is not important).  I'm thinking
another "for employee in content[18]" should be there, but when I tried,
my numbers were incorrect.

Actually -- I'd be more likely to load the data into an SQLite3
database, and use SQL queries to produce the above summary report. I'd have
to experiment with subselects to get the > and < sets, and then use a
count() and groupby to put them in order.
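
One way the SQLite idea could look is sketched below. Instead of subselects, it uses conditional aggregation (SUM over a CASE expression) with GROUP BY, which is a swapped-in technique and not necessarily what was meant above. The table `sales`, its two columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding just the two columns of interest.
conn.execute("CREATE TABLE sales (employee TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("EMP1", 50.0),
        ("EMP1", 120.0),
        ("EMP2", 99.75),
        ("EMP2", 10.0),
        ("EMP2", 20.0),
    ],
)

# One output row per employee: count of sales below 99.75, and count at or above it.
summary = conn.execute(
    """
    SELECT employee,
           SUM(CASE WHEN price < 99.75 THEN 1 ELSE 0 END)  AS under_count,
           SUM(CASE WHEN price >= 99.75 THEN 1 ELSE 0 END) AS at_or_over_count
    FROM sales
    GROUP BY employee
    ORDER BY employee
    """
).fetchall()

for employee, under_count, at_or_over_count in summary:
    print(employee, under_count, at_or_over_count)
```
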
Thanks!  I knew there was a more refined way to do this.  I'm still 
learning, so this is a huge help.
The data actually already comes from a database, but it doesn't allow me to 
produce a summary report, just a list of each item, which I can then 
export to pdf, or to the csv that I'm working with.  The developers have a 
ticket for a feature request, but they've had it there for over five 
years and I don't see them implementing it anytime soon.



Re: Easier way to do this?

2017-10-05 Thread 20/20 Lab

On 10/05/2017 07:28 AM, Neil Cerutti wrote:

On 2017-10-04, 20/20 Lab  wrote:

It's not quite a 'learning exercise', but I learn on my own if
I treat it as such.  This is just to cut down a few hours of
time for me every week filtering the file by hand for the
office manager.

That looks like a 30-second job using a pivot table in Excel.
Office manager, learn thy Excel!

On the other hand, I think Python's csv module is a killer app,
so I do recommend taking the opportunity to learn csv.DictReader
and csv.DictWriter for your own enjoyment.
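
As a rough illustration of csv.DictReader and csv.DictWriter: the reader yields each row as a dict keyed by the header line, and the writer does the reverse. The column names `Staff`, `Item`, and `Price`, and the sample data, are assumptions made up for this sketch; a real export would have its own headers.

```python
import csv
import io

# Hypothetical export with a header row; the column names are assumptions.
raw = io.StringIO(
    "Staff,Item,Price\n"
    "EMP1,Widget,50.00\n"
    "EMP2,Gadget,120.00\n"
    "EMP1,Gizmo,99.75\n"
)

# DictReader maps each data row to {"Staff": ..., "Item": ..., "Price": ...}.
totals = {}
for row in csv.DictReader(raw):
    totals[row["Staff"]] = totals.get(row["Staff"], 0.0) + float(row["Price"])

# DictWriter turns dicts back into CSV lines in the order given by fieldnames.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Staff", "Total"])
writer.writeheader()
for staff, total in sorted(totals.items()):
    writer.writerow({"Staff": staff, "Total": f"{total:.2f}"})

print(out.getvalue())
```

Referring to columns by name rather than by position (such as index 18 or 24) keeps the code readable if the export layout changes.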

It would be if our practice management software would export to Excel, 
or even a realistic csv.  Problem is that it only exports to csv and 
it's 80-90% garbage and redundant information.  I've taken to bringing 
the csv into Excel and refining it so I can do that, but again, I'd 
rather take half a second and have a program do it for me.  ;)



Re: Easier way to do this?

2017-10-04 Thread 20/20 Lab



On 10/04/2017 01:55 PM, Ben Bacarisse wrote:

20/20 Lab  writes:


Looking for advice for what looks to me like clumsy code.

I have a large csv (effectively garbage) dump.  I have to pull out
sales information per employee and count them by price range. I've got
my code working, but I'm thinking there must be a more refined way of
doing this.

I second the suggestion to use the CSV module.  It's very simple to use.


---snippet of what I have---

EMP1 = [0,0]
EMP2 = [0,0]
EMP3 = [0,0]

for line in (inputfile):
     content = line.split(",")
     if content[18] == "EMP1":
         if float(content[24]) < 99.75:
             EMP1[0] += 1
         elif float(content[24]) > 99.74:
             EMP1[1] += 1
     if content[18] == "EMP2":
         if float(content[24]) < 99.75:
             EMP2[0] += 1
         elif float(content[24]) > 99.74:
             EMP2[1] += 1
     if content[18] == "EMP3":
         if float(content[24]) < 99.75:
             EMP3[0] += 1
         elif float(content[24]) > 99.74:
             EMP3[1] += 1

and repeat if statements for the rest of 25+ employees.

Eek!  When you have named objects selected using a string that is the
object's name you know you want a dict.  You'd have a single dict for
all employees, keyed by the tag in field 18 of the file.  Does that
help?

I'm deliberately not saying more because this looks like a learning
exercise and you probably want to do most of it yourself.




Looks like I'll be going with the CSV module.  Should trim it up nicely.

It's not quite a 'learning exercise', but I learn on my own if I treat 
it as such.  This is just to cut down a few hours of time for me every 
week filtering the file by hand for the office manager.


Thanks for the pointers,
Matt


Re: Easier way to do this?

2017-10-04 Thread 20/20 Lab


On 10/04/2017 12:47 PM, breamore...@gmail.com wrote:

On Wednesday, October 4, 2017 at 8:29:26 PM UTC+1, 20/20 Lab wrote:



Any help / advice is appreciated,

Matt

Use the csv module https://docs.python.org/3/library/csv.html to read the file 
with a Counter 
https://docs.python.org/3/library/collections.html#collections.Counter.  I'm 
sorry but I'm too knackered to try writing the code for you :-(
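
A sketch of what the csv-plus-Counter combination might look like: each sale increments a counter keyed by an (employee, bucket) tuple. The column positions and the 99.75 cut-off are taken from the original question; the function `bucket_counts`, the helper `fake_line`, the bucket labels, and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter

NAME_INDEX = 18    # employee column, from the original question
PRICE_INDEX = 24   # price column, from the original question
THRESHOLD = 99.75


def bucket_counts(lines):
    """Count sales per (employee, bucket); 'over' means at or above THRESHOLD."""
    counts = Counter()
    for row in csv.reader(lines):
        bucket = "under" if float(row[PRICE_INDEX]) < THRESHOLD else "over"
        counts[(row[NAME_INDEX], bucket)] += 1
    return counts


def fake_line(name, price):
    """Build a 25-column line with only the two columns of interest filled in."""
    fields = ["x"] * 25
    fields[NAME_INDEX] = name
    fields[PRICE_INDEX] = price
    return ",".join(fields)


sample = io.StringIO("\n".join([
    fake_line("EMP1", "10.00"),
    fake_line("EMP1", "20.00"),
    fake_line("EMP1", "150.00"),
    fake_line("EMP2", "99.75"),
]))
counts = bucket_counts(sample)
for (name, bucket), n in sorted(counts.items()):
    print(name, bucket, n)
```

A Counter returns 0 for keys it has never seen, so looking up an employee with no sales in a bucket does not raise KeyError.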

--
Kindest regards.

Mark Lawrence.
This looks to be exactly what I want.  I'll get to reading.  Thank you 
very much.


Matt


Easier way to do this?

2017-10-04 Thread 20/20 Lab

Looking for advice for what looks to me like clumsy code.

I have a large csv (effectively garbage) dump.  I have to pull out sales 
information per employee and count them by price range. I've got my code 
working, but I'm thinking there must be a more refined way of doing this.


---snippet of what I have---

EMP1 = [0,0]
EMP2 = [0,0]
EMP3 = [0,0]

for line in (inputfile):
    content = line.split(",")
    if content[18] == "EMP1":
        if float(content[24]) < 99.75:
            EMP1[0] += 1
        elif float(content[24]) > 99.74:
            EMP1[1] += 1
    if content[18] == "EMP2":
        if float(content[24]) < 99.75:
            EMP2[0] += 1
        elif float(content[24]) > 99.74:
            EMP2[1] += 1
    if content[18] == "EMP3":
        if float(content[24]) < 99.75:
            EMP3[0] += 1
        elif float(content[24]) > 99.74:
            EMP3[1] += 1

and repeat if statements for the rest of 25+ employees.  I can make a 
list of the employees, but I'd prefer to pull them from the csv, as our 
turnover is rather high (however this is not important).  I'm thinking 
another "for employee in content[18]" should be there, but when I tried, 
my numbers were incorrect.


Any help / advice is appreciated,

Matt



Re: If you are running 32-bit 3.6 on Windows, please test this

2017-08-31 Thread 20/20 Lab



On 08/31/2017 01:53 AM, Pavol Lisy wrote:

On 8/31/17, Terry Reedy  wrote:

On 8/30/2017 1:35 PM, Terry Reedy wrote:

https://stackoverflow.com/questions/45965545/math-sqrt-domain-error-when-square-rooting-a-positive-number



reports the following:
-
Microsoft Windows [Version 10.0.16251.1002]
(c) 2017 Microsoft Corporation. All rights reserved.

C:\Users\Adam>python
Python 3.6.2 (v3.6.2:5fd33b5, Jul  8 2017, 04:14:34) [MSC v.1900 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
  >>> import math
  >>> math.sqrt(1.3)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: math domain error
  >>>

I upgraded from version 3.6.1 to 3.6.2 to try to resolve the issue and
restarted my computer but it is still occurring. Some numbers are
working (1.2, 1.4) and some others are also not working (1.128).


Neither installed 64 bit 3.6.2 nor my repository 3.6 32-bit debug build
reproduce this.  If anyone has the python.org 32bit 3.6.1/2 releases
installed on Windows, please test and report.
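
For anyone wanting to run the check as a script rather than at the prompt, a small sketch follows. It prints the interpreter version and bitness, then tries math.sqrt on the four values mentioned in the report. The variable name `results` is just for illustration.

```python
import math
import platform
import sys

print(sys.version)
print(platform.architecture()[0])  # '32bit' or '64bit' for the running interpreter

# Values taken from the report: 1.2 and 1.4 worked there, 1.3 and 1.128 did not.
results = {}
for value in (1.2, 1.3, 1.4, 1.128):
    try:
        results[value] = math.sqrt(value)
    except ValueError as exc:
        results[value] = exc  # keep the error so it can be reported
    print(value, "->", results[value])
```

On an unaffected install every line should show a float; a "math domain error" for any of these values would reproduce the reported problem.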

Three people have reported that math.sqrt(1.3) works in 32 bit Python on
64-bit Windows and no one otherwise.  I reported back on SO that the
problem is likely local.  Thanks for the responses.

Problem is reported on win10 and I see just 2 tests on win7 (third is
maybe Terry's but on SO I don't see win version).

If I am not wrong (from analyzing the source code), sqrt calls a function
from "libm", which is some kind of msvcrt.dll on Windows... (see
https://github.com/python/cpython/blob/a0ce375e10b50f7606cb86b072fed7d8cd574fe7/Modules/mathmodule.c#L1183
and  
https://github.com/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/ctypes/util.py#L34
)

And with "MSC v. 1900 ..."  it seems that "alternative approaches"
(see here https://bugs.python.org/issue23606 ) are used.

So I would be cautious.

PS.
BTW on my ubuntu I got this:

from ctypes import cdll
print(cdll.LoadLibrary("libcrypt.so"))

print(cdll.LoadLibrary("libm.so"))
...
OSError: /usr/lib/x86_64-linux-gnu/libm.so: invalid ELF header

(same with distro's python3, python2 and anaconda 3.6.2)

So this test  ->
https://github.com/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/ctypes/util.py#L328

has to crash on some environments (see for example:
https://github.com/scipy/scipy/pull/5416/files ). And it seems like
some test failures are just ignored...


Valid point.  I fired up a Windows 10 machine and it worked as well.

Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit 
(Intel)] on win32

Type "copyright", "credits" or "license()" for more information.
>>> import math
>>> math.sqrt(1.3)
1.140175425099138
>>>

This machine does not have the Creators Update yet.  So there's that.


Re: If you are running 32-bit 3.6 on Windows, please test this

2017-08-30 Thread 20/20 Lab

On 08/30/2017 10:35 AM, Terry Reedy wrote:
https://stackoverflow.com/questions/45965545/math-sqrt-domain-error-when-square-rooting-a-positive-number 



reports the following:
-
Microsoft Windows [Version 10.0.16251.1002]
(c) 2017 Microsoft Corporation. All rights reserved.

C:\Users\Adam>python
Python 3.6.2 (v3.6.2:5fd33b5, Jul  8 2017, 04:14:34) [MSC v.1900 32 
bit (Intel)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> math.sqrt(1.3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: math domain error
>>>

I upgraded from version 3.6.1 to 3.6.2 to try to resolve the issue and 
restarted my computer but it is still occurring. Some numbers are 
working (1.2, 1.4) and some others are also not working (1.128).



Neither installed 64 bit 3.6.2 nor my repository 3.6 32-bit debug 
build reproduce this.  If anyone has the python.org 32bit 3.6.1/2 
releases installed on Windows, please test and report.




Works here.

Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit 
(Intel)] on win32

Type "copyright", "credits" or "license()" for more information.
>>> import math
>>> math.sqrt(1.3)
1.140175425099138
>>>

Fresh windows7 x64 install with python 32bit

(My apologies to Terry for "Reply" instead of "Reply to List")

-Matt


Re: os.getlogin() Error

2017-05-05 Thread 20/20 Lab
I'm not sure if this will help you, but I found something by accident 
while looking at a related issue.  It looked promising:

https://github.com/parmentelat/apssh/issues/1

==Some snippets from the page

From the os.getlogin() docs: "Returns the user logged in to the 
controlling terminal of the process." Your script does not have a 
controlling terminal when run from cron. The docs go on to suggest: "For 
most purposes, it is more useful to use the environment variable LOGNAME 
to find out who the user is, or pwd.getpwuid(os.getuid())[0] to get the 
login name of the currently effective user id."



I suggest you replace os.getlogin with:

import pwd
import os

getlogin = lambda: pwd.getpwuid(os.getuid())[0]
default_username = getlogin()

==
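
A slightly more defensive variant of the snippet above might try os.getlogin() first and fall back only when there is no controlling terminal. This is a sketch under assumptions: the function name `current_username` is made up, the pwd module is Unix-only, and falling back to the numeric uid as a string is an arbitrary last resort. Note that after a restart as root the pwd fallback reports 'root', which is the limitation described in the question below.

```python
import os
import pwd  # Unix-only module


def current_username():
    """Best-effort user name: os.getlogin(), then the passwd entry for our uid."""
    try:
        return os.getlogin()  # raises OSError without a controlling terminal
    except OSError:
        pass
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        # Assumption: if the uid has no passwd entry, return it as a string.
        return str(os.getuid())


name = current_username()
print(name)
```
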


On 05/04/2017 01:03 PM, Wildman via Python-list wrote:

I wrote a Linux only GUI program using Tk that reports various system
information using a tabbed Notebook.  I have tested the program on
Debian, SoldyX and MX-15 and the program runs perfectly.

I tried testing on Mint and Ubuntu and the program would crash.  The
GUI would appear briefly and disappear.  On Ubuntu a crash report was
created so I was able to figure out what was going on.  It had the
traceback and showed that os.getlogin threw an error.  This is from
the crash report:

PythonArgs: ['/opt/linfo-tk/linfo-tk.py']
Traceback:
  Traceback (most recent call last):
    File "/opt/linfo-tk/linfo-tk.py", line 1685, in <module>
      app = Window(root)
    File "/opt/linfo-tk/linfo-tk.py", line 1393, in __init__
      txt = function()
    File "/opt/linfo-tk/linfo-tk.py", line 316, in userinfo
      user = os.getlogin()
  OSError: [Errno 25] Inappropriate ioctl for device

The program installs using the Debian package system (.deb) and an
entry is created in the Applications Menu.  The strange thing is
that the crash only occurs when the program is run from the menu.
If I open a terminal and run the program from there, the program
runs fine.

I found a little info on the web about this but it was not clear
whether it is a bug in Linux or a bug in the os module.  I also
found a couple of work-arounds but neither of them will work for
my purposes.

 user = pwd.getpwuid(os.getuid())[0]
 user = getpass.getuser()

I will try to explain...
The program reports system information based on the user's name.
Things such as passwd, groups and shadow info.  However, the
program must have elevated privileges to get the shadow info so
the program has the option to 'restart as root' so the shadow
information will be obtainable.

If the program is restarting as root, the work-arounds report
the user as 'root'.  Then the system information for passwd,
groups and shadow will be reported for 'root' and not the
user that ran the program.  The actual user name that ran
the program is needed for the program to report correct
information.

It seems that only os.getlogin() reports the true user name no
matter if the program is run as a normal user or restarted as
root.

Is there a way to get the actual user name or is there a fix
or a better work-around for the os.getlogin() function?





Re: OT: Anyone here use the ConEmu console app?

2016-04-11 Thread 20/20 Lab

win+alt+space does not work?  ctrl+alt+win+space?

http://conemu.github.io/en/KeyboardShortcuts.html

Says those are not configurable, so they should work.

On 04/11/2016 02:49 PM, DFS wrote:
I turned on the Quake-style option (and auto-hide when it loses focus) 
and it disappeared and I can't figure out how to get it back onscreen. 
I think there's a keystroke combo (like Win+key) but I don't know what 
it is.


It shows in the Task Manager Processes, but not in the Alt+Tab list.

Uninstalled and reinstalled and now it launches Quake-style and 
hidden.  Looked everywhere (\Users\AppData\Local, Registry) for 
leftover settings file but couldn't find it.


Here's the screen where you make the Quake-style setting.
https://conemu.github.io/en/SettingsAppearance.html



Thanks




Re: Linux users: please run gui tests

2015-08-10 Thread 20/20 Lab



On 08/06/2015 07:07 PM, Terry Reedy wrote:
Python has an extensive test suite run after each 'batch' of commits 
on a variety of buildbots.  However, the Linux buildbots all (AFAIK) 
run 'headless', with gui's disabled.  Hence the following

test_tk test_ttk_guionly test_idle
(and on 3.5, test_tix, but not important)
are skipped either in whole or in part.

We are planning on adding the use of tkinter.ttk to Idle after the 
3.5.0 release, but a couple of other core developers have expressed 
concern about the reliability of tkinter.ttk on Linux.


There is also an unresolved issue where test_ttk hung on Ubuntu Unity 
3 years ago. https://bugs.python.org/issue14799


I would appreciate it if some people could run the linux version of
py -3.4 -m test -ugui test_tk test_ttk_guionly test_idle
(or 3.5).  I guess this means 'python3' for the executable.

and report here python version, linux system, and result.
Alteration of environment and locale is a known issue, skip that.


$ uname -a
Linux term 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 
x86_64 x86_64 x86_64 GNU/Linux


$ python3 -m test -ugui test_tk test_ttk_guionly test_idle
[1/3] test_tk
[2/3] test_ttk_guionly
[3/3] test_idle
All 3 tests OK.




Re: Looking for direction

2015-05-21 Thread 20/20 Lab
You're the second to recommend this to me.  I ended up picking it up last 
week, so I need to sit down with it.  I was able to get a working 
project.  However, I don't fully grasp the details of how it works, so the 
book will help, I'm sure.


Thank you.

On 05/20/2015 05:50 AM, darnold via Python-list wrote:

I recommend getting your hands on "Automate The Boring Stuff With Python" from 
no starch press:

http://www.nostarch.com/automatestuff

I've not read it in its entirety, but it's very beginner-friendly and is 
targeted at just the sort of processing you appear to be doing.

HTH,
Don




Re: Looking for direction

2015-05-15 Thread 20/20 Lab



On 05/13/2015 06:12 PM, Dave Angel wrote:

On 05/13/2015 08:45 PM, 20/20 Lab wrote:>

You accidentally replied to me, rather than the mailing list. Please 
use reply-list, or if your mailer can't handle that, do a Reply-All, 
and remove the parts you don't want.


>
> On 05/13/2015 05:07 PM, Dave Angel wrote:
>> On 05/13/2015 07:24 PM, 20/20 Lab wrote:
>>> I'm a beginner to python.  Reading here and there. Written a couple of
>>> short and simple programs to make life easier around the office.
>>>
>> Welcome to Python, and to this mailing list.
>>
>>> That being said, I'm not even sure what I need to ask for. I've never
>>> worked with external data before.
>>>
>>> I have a LARGE csv file that I need to process.  110+ columns, 72k
>>> rows.
>>
>> That's not very large at all.
>>
> In the grand scheme, I guess not.  However I'm currently doing this
> whole process using office.  So it can be a bit daunting.

I'm not familiar with the "office" operating system.

>>>  I managed to write enough to reduce it to a few hundred rows, and
>>> the five columns I'm interested in.
>>
>>>
>>> Now is where I have my problem:
>>>
>>> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>>> [72976, "YYY", "Item", "Qty", "Noise"],
>>> [123, "XXX", "ItemTypo", "Qty", "Noise"]]
>>>
>>
>> It'd probably be useful to identify names for your columns, even if
>> it's just in a comment.  Guessing from the paragraph below, I figure
>> the first two columns are "account" & "staff"
>
> The columns that I pull are Account, Staff, Item Sold, Quantity sold,
> and notes about the sale (notes aren't particularly needed, but the
> higher ups would like them in the report)
>>
>>> Basically, I need to check for rows with duplicate accounts (row[0]) and
>>> staff (row[1]), and if so, remove that row, and add its Qty to the
>>> original row.
>>
>> And which column is that supposed to be?  Shouldn't there be a number
>> there, rather than a string?
>>
>>> I really don't have a clue how to go about this.  The
>>> number of rows changes based on which run it is, so I couldn't even get
>>> away with using hundreds of compare loops.
>>>
>>> If someone could point me to some documentation on the functions I would
>>> need, or a tutorial it would be a great help.
>>>
>>
>> Is the order significant?  Do you have to preserve the order that the
>> accounts appear?  I'll assume not.
>>
>> Have you studied dictionaries?  Seems to me the way to handle the
>> problem is to read in a row, create a dictionary with key of (account,
>> staff), and data of the rest of the line.
>>
>> Each time you read a row, you check if the key is already in the
>> dictionary.  If not, add it.  If it's already there, merge the data as
>> you say.
>>
>> Then when you're done, turn the dict back into a list of lists.
>>
> The order is irrelevant.  No, I've not really studied dictionaries, but
> a few people have mentioned it.  I'll have to read up on them and, more
> importantly, their applications.  Seems that they are more versatile
> than I thought.
>
> Thank you.

You have to realize that a tuple can be used as a key, in your case a 
tuple of Account and Staff.


You'll have to decide how you're going to merge the ItemSold, 
QuantitySold, and notes.
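
To make the tuple-as-key idea concrete, here is one possible merge policy, sketched under assumptions: quantities are numeric and are summed, the item text of the first row seen is kept, and non-empty notes are joined with "; ". The function name `merge_rows` and the sample rows are hypothetical; a different policy for items or notes may suit the report better.

```python
def merge_rows(rows):
    """Merge rows that share (account, staff); sum quantities, join notes."""
    merged = {}
    for account, staff, item, qty, note in rows:
        key = (account, staff)  # a tuple is hashable, so it can be a dict key
        if key in merged:
            entry = merged[key]
            entry["qty"] += qty
            entry["notes"].append(note)
        else:
            merged[key] = {"item": item, "qty": qty, "notes": [note]}

    # Turn the dict back into a list of lists, in first-seen order.
    return [
        [account, staff, e["item"], e["qty"], "; ".join(n for n in e["notes"] if n)]
        for (account, staff), e in merged.items()
    ]


rows = [
    [123, "XXX", "Widget", 2, "rush"],
    [72976, "YYY", "Gadget", 1, ""],
    [123, "XXX", "Widgit", 3, "repeat"],  # same account and staff as the first row
]
result = merge_rows(rows)
for row in result:
    print(row)
```
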


Tells you how often I actually talk in mailing lists.  My apologies, and 
thank you again.



Re: Looking for direction

2015-05-15 Thread 20/20 Lab



On 05/13/2015 06:12 PM, Dave Angel wrote:

On 05/13/2015 08:45 PM, 20/20 Lab wrote:>

You accidentally replied to me, rather than the mailing list. Please 
use reply-list, or if your mailer can't handle that, do a Reply-All, 
and remove the parts you don't want.


...and now that you mention it.  I appear to have done that with all of 
my replies yesterday.


My deepest apologies for that.



Re: Looking for direction

2015-05-14 Thread 20/20 Lab



On 05/13/2015 06:23 PM, Steven D'Aprano wrote:

On Thu, 14 May 2015 09:24 am, 20/20 Lab wrote:


I'm a beginner to python.  Reading here and there.  Written a couple of
short and simple programs to make life easier around the office.

That being said, I'm not even sure what I need to ask for. I've never
worked with external data before.

I have a LARGE csv file that I need to process.  110+ columns, 72k
rows.  I managed to write enough to reduce it to a few hundred rows, and
the five columns I'm interested in.

That's not large. Large is millions of rows, or tens of millions if you have
enough memory. What's large to you and me is usually small to the computer.

You should use the csv module for handling the CSV file, if you aren't
already doing so. Do you need a url to the docs?

I actually stumbled across the csv module after coding enough to make a 
list of lists.  So that is more the reason I approached the list: 
nothing like spending hours (or days) coding something that already 
exists and that you just don't know about.

Now is where I have my problem:

myList = [ [123, "XXX", "Item", "Qty", "Noise"],
           [72976, "YYY", "Item", "Qty", "Noise"],
           [123, "XXX", "ItemTypo", "Qty", "Noise"]]

Basically, I need to check for rows with duplicate accounts (row[0]) and
staff (row[1]), and if so, remove that row, and add its Qty to the
original row. I really don't have a clue how to go about this.

Is the order of the rows important? If not, the problem is simpler.


processed = {}  # hold the processed data in a dict

for row in myList:
    account, staff = row[0:2]
    key = (account, staff)  # Put them in a tuple.
    if key in processed:
        # We've already seen this combination.
        processed[key][3] += row[3]  # Add the quantities.
    else:
        # Never seen this combination before.
        processed[key] = row

newlist = list(processed.values())


Does that help?



It does, immensely.  I'll make this work.  Thank you again for the link 
from yesterday and apologies for hitting the wrong reply button.  I'll 
have to study more on the usage and implementations of dictionaries and 
tuples.



Looking for direction

2015-05-13 Thread 20/20 Lab
I'm a beginner to python.  Reading here and there.  Written a couple of 
short and simple programs to make life easier around the office.


That being said, I'm not even sure what I need to ask for. I've never 
worked with external data before.


I have a LARGE csv file that I need to process.  110+ columns, 72k 
rows.  I managed to write enough to reduce it to a few hundred rows, and 
the five columns I'm interested in.


Now is where I have my problem:

myList = [ [123, "XXX", "Item", "Qty", "Noise"],
           [72976, "YYY", "Item", "Qty", "Noise"],
           [123, "XXX", "ItemTypo", "Qty", "Noise"]]

Basically, I need to check for rows with duplicate accounts (row[0]) and 
staff (row[1]), and if so, remove that row, and add its Qty to the 
original row. I really don't have a clue how to go about this.  The 
number of rows changes based on which run it is, so I couldn't even get 
away with using hundreds of compare loops.


If someone could point me to some documentation on the functions I would 
need, or a tutorial it would be a great help.


Thank you.