Nested dictionaries trouble

2007-04-11 Thread IamIan
Hello,

I'm writing a simple FTP log parser that sums file sizes as it runs. I
have a yearTotals dictionary with year keys and the monthTotals
dictionary as its values. The monthTotals dictionary has month keys
and file size values. The script works except the results are written
for all years, rather than just one year. I'm thinking there's an
error in the way I set my dictionaries up or reference them...

import glob, traceback

years = ["2005", "2006", "2007"]
months = ["01","02","03","04","05","06","07","08","09","10","11","12"]
# Create months dictionary to convert log values
logMonths = {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06",
             "Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}
# Create monthTotals dictionary with default 0 value
monthTotals = dict.fromkeys(months, 0)
# Nest monthTotals dictionary in yearTotals dictionary
yearTotals = {}
for year in years:
  yearTotals.setdefault(year, monthTotals)

currentLogs = glob.glob("/logs/ftp/*")

try:
  for currentLog in currentLogs:
    readLog = open(currentLog,"r")
    for line in readLog.readlines():
      if not line: continue
      if len(line) < 50: continue
      logLine = line.split()

      # The 2nd field is the month, the 5th the year, the 8th the file size
      # (indices 1, 4 and 7 when counting from zero)

      # Lookup year/month pair value
      logMonth = logMonths[logLine[1]]
      currentYearMonth = yearTotals[logLine[4]][logMonth]

      # Update year/month value
      currentYearMonth += int(logLine[7])
      yearTotals[logLine[4]][logMonth] = currentYearMonth
except:
  print "Failed on: " + currentLog
  traceback.print_exc()

# Print dictionaries
for x in yearTotals.keys():
  print "KEY",'\t',"VALUE"
  print x,'\t',yearTotals[x]
  #print "  key",'\t',"value"
  for y in yearTotals[x].keys():
    print "  ",y,'\t',yearTotals[x][y]


Thank you,
Ian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Nested dictionaries trouble

2007-04-11 Thread Gabriel Genellina
On Wed, 11 Apr 2007 15:57:56 -0300, IamIan <[EMAIL PROTECTED]> wrote:

> I'm writing a simple FTP log parser that sums file sizes as it runs. I
> have a yearTotals dictionary with year keys and the monthTotals
> dictionary as its values. The monthTotals dictionary has month keys
> and file size values. The script works except the results are written
> for all years, rather than just one year. I'm thinking there's an
> error in the way I set my dictionaries up or reference them...

> monthTotals = dict.fromkeys(months, 0)
> # Nest monthTotals dictionary in yearTotals dictionary
> yearTotals = {}
> for year in years:
>   yearTotals.setdefault(year, monthTotals)

All your years share the *same* monthTotals object.
This is similar to this FAQ entry:  

You have to create a new dict for each year; replace the above code with:

yearTotals = {}
for year in years:
 yearTotals[year] = dict.fromkeys(months, 0)
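
The sharing can be checked directly with the `is` operator. The following is only a small sketch, written for Python 3 rather than the Python 2 used in this thread; the names `shared` and `separate` are invented for the illustration.

```python
years = ["2005", "2006", "2007"]
months = ["%02d" % m for m in range(1, 13)]

# Original setup: one monthTotals dict stored under every year.
monthTotals = dict.fromkeys(months, 0)
shared = {}
for year in years:
    shared.setdefault(year, monthTotals)

# Suggested fix: build a fresh dict on each pass through the loop.
separate = {}
for year in years:
    separate[year] = dict.fromkeys(months, 0)

shared["2005"]["01"] += 100
separate["2005"]["01"] += 100

print(shared["2005"] is shared["2007"])      # True: same object
print(separate["2005"] is separate["2007"])  # False: distinct objects
print(shared["2007"]["01"], separate["2007"]["01"])  # 100 0
```

With the shared dict, an update made through one year is visible through every other year, which matches the symptom described in the question.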

-- 
Gabriel Genellina


Re: Nested dictionaries trouble

2007-04-11 Thread Terry Reedy

"IamIan" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Hello,
|
| I'm writing a simple FTP log parser that sums file sizes as it runs. I
| have a yearTotals dictionary with year keys and the monthTotals
| dictionary as its values. The monthTotals dictionary has month keys
| and file size values. The script works except the results are written
| for all years, rather than just one year. I'm thinking there's an
| error in the way I set my dictionaries up or reference them...
|
| import glob, traceback
|
| years = ["2005", "2006", "2007"]
| months = ["01","02","03","04","05","06","07","08","09","10","11","12"]
| # Create months dictionary to convert log values
| logMonths = {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06",
|              "Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}
| # Create monthTotals dictionary with default 0 value
| monthTotals = dict.fromkeys(months, 0)
| # Nest monthTotals dictionary in yearTotals dictionary
| yearTotals = {}
| for year in years:
|  yearTotals.setdefault(year, monthTotals)

Try

  yearTotals.setdefault(year, dict.fromkeys(months, 0))

so that each year starts with its own sub-dict instead of one shared by all.

tjr





Re: Nested dictionaries trouble

2007-04-11 Thread Bruno Desthuilliers
IamIan wrote:
> Hello,
> 
> I'm writing a simple FTP log parser that sums file sizes as it runs. I
> have a yearTotals dictionary with year keys and the monthTotals
> dictionary as its values. The monthTotals dictionary has month keys
> and file size values. The script works except the results are written
> for all years, rather than just one year. I'm thinking there's an
> error in the way I set my dictionaries up or reference them...
> 
> import glob, traceback
> 
> years = ["2005", "2006", "2007"]
> months = ["01","02","03","04","05","06","07","08","09","10","11","12"]
> # Create months dictionary to convert log values
> logMonths =
> {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06","Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}

DRY violation alert!

logMonths = {
   "Jan":"01",
   "Feb":"02",
   "Mar":"03",
   "Apr":"04",
   "May":"05",
   #etc
}

months = sorted(logMonths.values())

> # Create monthTotals dictionary with default 0 value
> monthTotals = dict.fromkeys(months, 0)
> # Nest monthTotals dictionary in yearTotals dictionary
> yearTotals = {}
> for year in years:
>   yearTotals.setdefault(year, monthTotals)

A complicated way to write:
yearTotals = dict((year, monthTotals) for year in years)

And without even reading further, I can tell you have a problem here: 
every 'year' entry in yearTotals points to *the same* monthTotals dict 
instance. So when you update yearTotals['2007'], the change shows up 
for all years. The cure is simple: forget the monthTotals 
object, and define your yearTotals dict this way:

yearTotals = dict((year, dict.fromkeys(months, 0)) for year in years)

NB: for Python versions older than 2.4, you need a list comprehension 
instead of a generator expression, i.e.:

yearTotals = dict([(year, dict.fromkeys(months, 0)) for year in years])

HTH


Re: Nested dictionaries trouble

2007-04-11 Thread 7stud
1) You have this setup:

logMonths = {"Jan":"01", "Feb":"02", ...}
yearTotals = {
    "2005": {"01":0, "02":0, ...},
    "2006": {...},
    "2007": {...},
}

Then when you get a value such as "Jan", you look up "Jan" in the
logMonths dictionary to get "01".  Then you use "01" and the year, say
"2005", to look up the value in the yearTotals dictionary.  Why do
that?  What is the point of even having the logMonths dictionary?  Why
not make "Jan" the key in the "2005" dictionary and look it up
directly:

yearTotals = {
    "2005": {"Jan":0, "Feb":0, ...},
    "2006": {...},
    "2007": {...},
}

That way you could completely eliminate the lookup in the logMonths
dict.

2) In this part:

logMonth = logMonths[logLine[1]]
currentYearMonth = yearTotals[logLine[4]][logMonth]
# Update year/month value
currentYearMonth += int(logLine[7])
yearTotals[logLine[4]][logMonth] = currentYearMonth

I'm not sure why you are using all those intermediate steps.  How
about:

yearTotals[logLine[4]][logLine[1]] += int(logLine[7])

To me that is a lot clearer.  Or, you could do this:

year, month, val = logLine[4], logLine[1], int(logLine[7])
yearTotals[year][month] += val
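
One reason the original needed its final write-back line is that integers are immutable, so `+=` on a local name only rebinds that name. A minimal sketch in Python 3 syntax; the names `current`, `after_local` and `after_inplace` are made up for this example.

```python
yearTotals = {"2005": {"Jan": 0}}

# Adding to a local name rebinds the name; the dict entry is untouched.
current = yearTotals["2005"]["Jan"]
current += 3
after_local = yearTotals["2005"]["Jan"]

# Augmented assignment on the subscript stores the new value in the dict.
yearTotals["2005"]["Jan"] += 7
after_inplace = yearTotals["2005"]["Jan"]

print(after_local, after_inplace)  # 0 7
```

That is why the one-line form does the lookup, the addition and the store in a single statement.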

3)
>I'm thinking there's an error in the way
>I set my dictionaries up or reference them

Yep.  It's right here:

for year in years:
  yearTotals.setdefault(year, monthTotals)

Every year refers to the same monthTotals dict.  You can use a dict's
copy() method to make a copy:

monthTotals.copy()
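
A caveat worth knowing: copy() makes a shallow copy. That is enough here because the values are plain integers, but nested mutable values would still be shared. A rough sketch in Python 3 syntax; the `nested`, `shallow` and `deep` names are invented for the illustration.

```python
import copy

monthTotals = dict.fromkeys(["Jan", "Feb"], 0)

# Each copy() call yields an independent dict of integer totals.
a = monthTotals.copy()
b = monthTotals.copy()
a["Jan"] += 5

# With a mutable value, a shallow copy still shares the inner object;
# copy.deepcopy() duplicates it as well.
nested = {"sizes": []}
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested["sizes"].append(1)

print(a["Jan"], b["Jan"], monthTotals["Jan"])  # 5 0 0
print(shallow["sizes"], deep["sizes"])         # [1] []
```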

Here is a reworking of your code that also eliminates a lot of typing:

import calendar, pprint

years = ["200%s" % x for x in range(5, 8)]
print years

months = list(calendar.month_abbr)
print months

monthTotals = dict.fromkeys(months[1:], 0)
print monthTotals

yearTotals = {}
for year in years:
    yearTotals.setdefault(year, monthTotals.copy())
pprint.pprint(yearTotals)

logs = [
    ["", "Feb", "", "", "2007", "", "", "12"],
    ["", "Jan", "", "", "2005", "", "", "3"],
    ["", "Jan", "", "", "2005", "", "", "7"],
]

for logLine in logs:
    year, month, val = logLine[4], logLine[1], int(logLine[7])
    yearTotals[year][month] += val

for x in yearTotals.keys():
    print "KEY", "\t", "VALUE"
    print x, "\t", yearTotals[x]
    for y in yearTotals[x].keys():
        print "   ", y, "\t", yearTotals[x][y]
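
For anyone running this on a current interpreter, the same reworking might look roughly like the following in Python 3. This is a sketch, not a drop-in replacement: the `month_names` variable is introduced here, and the month abbreviations assume the default (English) locale.

```python
import calendar

years = [str(y) for y in range(2005, 2008)]
month_names = list(calendar.month_abbr)[1:]  # drop the empty entry at index 0

# One fresh month dict per year, so no two years share totals.
yearTotals = dict((year, dict.fromkeys(month_names, 0)) for year in years)

# Stand-in records: month at index 1, year at index 4, size at index 7.
logs = [
    ["", "Feb", "", "", "2007", "", "", "12"],
    ["", "Jan", "", "", "2005", "", "", "3"],
    ["", "Jan", "", "", "2005", "", "", "7"],
]

for logLine in logs:
    year, month, val = logLine[4], logLine[1], int(logLine[7])
    yearTotals[year][month] += val

for year in sorted(yearTotals):
    print(year, sum(yearTotals[year].values()))
```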



Re: Nested dictionaries trouble

2007-04-11 Thread IamIan
Thank you everyone for the helpful replies. Some of the solutions were
new to me, but the script now runs successfully. I'm still learning to
ride the snake but I love this language!

Ian



Re: Nested dictionaries trouble

2007-04-11 Thread 7stud
On Apr 11, 2:57 pm, Bruno Desthuilliers
<[EMAIL PROTECTED]> wrote:
> IamIan wrote:
>
> yearTotals = dict([(year, dict.fromkeys(months, 0)) for year in years])
>
> HTH

List comprehensions without a list?  What? Where? How?




Re: Nested dictionaries trouble

2007-04-11 Thread 7stud
On Apr 11, 7:01 pm, "7stud" <[EMAIL PROTECTED]> wrote:
> On Apr 11, 2:57 pm, Bruno Desthuilliers
>
> <[EMAIL PROTECTED]> wrote:
> > IamIan wrote:
>
> > yearTotals = dict([(year, dict.fromkeys(months, 0)) for year in years])
>
> > HTH
>
> List comprehensions without a list?  What? Where? How?

Ooops.  I copied the wrong one.  I was looking at this one:

yearTotals = dict((year, monthTotals) for year in years)



Re: Nested dictionaries trouble

2007-04-11 Thread 7stud
On Apr 11, 7:28 pm, "7stud" <[EMAIL PROTECTED]> wrote:
> On Apr 11, 7:01 pm, "7stud" <[EMAIL PROTECTED]> wrote:
>
> > On Apr 11, 2:57 pm, Bruno Desthuilliers
>
> > <[EMAIL PROTECTED]> wrote:
> > > IamIan wrote:
>
> > > yearTotals = dict([(year, dict.fromkeys(months, 0)) for year in years])
>
> > > HTH
>
> > List comprehensions without a list?  What? Where? How?
>
> Ooops.  I copied the wrong one.  I was looking at this one:
>
> yearTotals = dict((year, monthTotals) for year in years)

Never mind.  I found this PEP:

http://www.python.org/dev/peps/pep-0289/
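
For readers who hit the same confusion: a generator expression is written like a list comprehension without the square brackets, and when it is the sole argument of a call, the call's own parentheses are enough. A minimal sketch in Python 3 syntax; the names `from_list` and `from_genexp` are made up for this example.

```python
years = ["2005", "2006", "2007"]
months = ["%02d" % m for m in range(1, 13)]

# List comprehension: the full list of (year, dict) pairs is built first.
from_list = dict([(year, dict.fromkeys(months, 0)) for year in years])

# Generator expression: pairs are handed to dict() one at a time.
from_genexp = dict((year, dict.fromkeys(months, 0)) for year in years)

print(from_list == from_genexp)  # True
```

Both forms call dict.fromkeys() once per year, so each year still gets its own month dict.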



Re: Nested dictionaries trouble

2007-04-18 Thread IamIan
I am using the suggested approach to make a years list:

years = ["199%s" % x for x in range(0,10)]
years += ["200%s" % x for x in range(0,10)]

I haven't had any luck doing this in one line though. Is it possible?

Thanks.



Re: Nested dictionaries trouble

2007-04-18 Thread Steven W. Orr
On Wednesday, Apr 18th 2007 at 12:16 -0700, quoth IamIan:

=>I am using the suggested approach to make a years list:
=>
=>years = ["199%s" % x for x in range(0,10)]
=>years += ["200%s" % x for x in range(0,10)]
=>
=>I haven't had any luck doing this in one line though. Is it possible?

I'm so green that I almost get a chubby at being able to answer something. 
;-)

years = [str(1990+x) for x in range(0,20)]

Yes?

-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net


Re: Nested dictionaries trouble

2007-04-18 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, IamIan wrote:

> years = ["199%s" % x for x in range(0,10)]
> years += ["200%s" % x for x in range(0,10)]
> 
> I haven't had any luck doing this in one line though. Is it possible?

In [48]: years = map(str, xrange(1999, 2011))

In [49]: years
Out[49]:
['1999',
 '2000',
 '2001',
 '2002',
 '2003',
 '2004',
 '2005',
 '2006',
 '2007',
 '2008',
 '2009',
 '2010']

Ciao,
Marc 'BlackJack' Rintsch


Re: Nested dictionaries trouble

2007-04-18 Thread Steven D'Aprano
On Wed, 18 Apr 2007 12:16:12 -0700, IamIan wrote:

> I am using the suggested approach to make a years list:
> 
> years = ["199%s" % x for x in range(0,10)]
> years += ["200%s" % x for x in range(0,10)]
> 
> I haven't had any luck doing this in one line though. Is it possible?

years = ["199%s" % x for x in range(0,10)] + \
["200%s" % x for x in range(0,10)]

Sorry for the line continuation, my news reader insists on breaking the
line. In your editor, just delete the "\" and line break to make it a
single line.


If you don't like that solution, here's a better one:

years = [str(1990 + n) for n in range(20)]

Or there's this:

years = [str(n) for n in range(1990, 2010)]

Or this one:

years = map(str, range(1990, 2010))
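
As a quick check that these variants agree, here is a sketch for Python 3, where map() returns a lazy iterator and has to be wrapped in list(). The single-letter names are just labels for the four spellings above.

```python
a = ["199%s" % x for x in range(0, 10)] + ["200%s" % x for x in range(0, 10)]
b = [str(1990 + n) for n in range(20)]
c = [str(n) for n in range(1990, 2010)]
d = list(map(str, range(1990, 2010)))  # list() needed on Python 3

print(a == b == c == d)  # True
print(a[0], a[-1])       # 1990 2009
```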


-- 
Steven.



Re: Nested dictionaries trouble

2007-04-19 Thread IamIan
Thank you again for the great suggestions. I have one final question
about creating a httpMonths dictionary like {'Jan':'01' , 'Feb':'02' ,
etc} with a minimal amount of typing. My code follows (using Python
2.3.4):

import calendar

# Create years list, formatting as strings
years = map(str, xrange(1990,2051))

# Create months list with three letter abbreviations
months = list(calendar.month_abbr)

# Create monthTotals dictionary with default value of zero
monthTotals = dict.fromkeys(months[1:],0)

# Create yearTotals dictionary with years for keys
# and copies of the monthTotals dictionary for values
yearTotals = dict([(year, monthTotals.copy()) for year in years])

# Create httpMonths dictionary to map month abbreviations
# to Apache numeric month representations
httpMonths = {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06",
              "Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}

It is this last step I'm referring to. I got close with:
httpMonths = {}
for month in months[1:]:
  httpMonths[month] = str(len(httpMonths)+1)

but the month numbers are missing the leading zero for 01-09. Thanks!

Ian



Re: Nested dictionaries trouble

2007-04-21 Thread rzed
IamIan <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> Thank you again for the great suggestions. I have one final
> question about creating a httpMonths dictionary like {'Jan':'01'
> , 'Feb':'02' , etc} with a minimal amount of typing. My code
> follows (using Python 2.3.4):
> 
> import calendar
> 
> # Create years list, formatting as strings
> years = map(str, xrange(1990,2051))
> 
> # Create months list with three letter abbreviations
> months = list(calendar.month_abbr)
> 
> # Create monthTotals dictionary with default value of zero
> monthTotals = dict.fromkeys(months[1:],0)
> 
> # Create yearTotals dictionary with years for keys
> # and copies of the monthTotals dictionary for values
> yearTotals = dict([(year, monthTotals.copy()) for year in years])
> 
> # Create httpMonths dictionary to map month abbreviations
> # to Apache numeric month representations
> httpMonths = {"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05","Jun":"06",
>               "Jul":"07","Aug":"08","Sep":"09","Oct":"10","Nov":"11","Dec":"12"}
> 
> It is this last step I'm referring to. I got close with:
> httpMonths = {}
> for month in months[1:]:
>   httpMonths[month] = str(len(httpMonths)+1)
> 
> but the month numbers are missing the leading zero for 01-09.
> Thanks! 
> 

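Maybe something like:

httpMonths = dict([(k, "%02d" % (x+1))
                   for x, k in enumerate(months[1:])])

(On Python 2.3 dict() needs the list comprehension; from 2.4 on the
square brackets can be dropped to make it a generator expression.)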
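
The leading zero comes from the `%02d` format, which pads to two digits; `str.zfill(2)` does the same for a string. A small sketch in Python 3 syntax, assuming the default (English) locale for calendar.month_abbr. Enumerating the full list keeps each index equal to the month number, and the `if name` filter skips the empty entry at index 0.

```python
import calendar

months = list(calendar.month_abbr)  # months[0] is the empty string

httpMonths = dict((name, "%02d" % number)
                  for number, name in enumerate(months) if name)

print(httpMonths["Jan"], httpMonths["Sep"], httpMonths["Dec"])  # 01 09 12
print("%02d" % 7, str(7).zfill(2))                              # 07 07
```

The zfill() spelling would also patch the loop from the question directly: `str(len(httpMonths)+1).zfill(2)`.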

-- 
rzed


Re: Nested dictionaries trouble

2007-04-21 Thread Bruno Desthuilliers
IamIan wrote:
> I am using the suggested approach to make a years list:
> 
> years = ["199%s" % x for x in range(0,10)]
> years += ["200%s" % x for x in range(0,10)]
> 
> I haven't had any luck doing this in one line though. Is it possible?

# Quick, dirty and pretty obvious
years = ["199%s" % x for x in range(0,10)] + \
        ["200%s" % x for x in range(0,10)]

# hardly more involved, and rather more generic
years = ["%s%s" % (c, y) for c in ("199", "200") for y in range(10)]
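
A comprehension with two `for` clauses reads like two nested loops, with the leftmost clause as the outer loop. A sketch in Python 3 syntax; the names `prefix`, `digit` and `expanded` are invented here, and the prefixes "199" and "200" are chosen to cover 1990 through 2009 as in the question.

```python
years = ["%s%s" % (prefix, digit)
         for prefix in ("199", "200")
         for digit in range(10)]

# The same thing spelled out as explicit nested loops.
expanded = []
for prefix in ("199", "200"):
    for digit in range(10):
        expanded.append("%s%s" % (prefix, digit))

print(years == expanded)    # True
print(years[0], years[-1])  # 1990 2009
```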