Re: [Tutor] Dictionaries and aggregation

2006-04-25 Thread Paul Churchill



Right, I think I've got the idea now. Thanks for all the contributions on this.


Paul

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf
Of Karl "Pflästerer"
Sent: 25 April 2006 22:28
To: tutor@python.org
Subject: Re: [Tutor] Dictionaries and aggregation

On 25 Apr 2006, [EMAIL PROTECTED] wrote:

[...]
> Here's the list I'm starting with:
>
> >>> for i in rLst:
> ...     print i, type(i)
>
> server001  alive 17.1% 2 requests/s 14805416 total <type 'str'>
> server001  alive 27.2% 7 requests/s 14851125 total <type 'str'>
> server002  alive 22.9% 6 requests/s 15173311 total <type 'str'>
> server002  alive 42.0% 8 requests/s 15147869 total <type 'str'>
> server003  alive 17.9% 4 requests/s 15220280 total <type 'str'>
> server003  alive 22.0% 4 requests/s 15260951 total <type 'str'>
> server004  alive 18.5% 3 requests/s 15484524 total <type 'str'>
> server004  alive 31.6% 9 requests/s 15429303 total <type 'str'>
>
> I've split each string in the list, extracted what I want and fed it into
> an empty dictionary.
>
> >>> rDict = {}
> >>> i = 0
> >>> while i < len(rLst):
> ...     x, y = rLst[i].split()[0], int(rLst[i].split()[3])
> ...     rDict[x] = y
> ...     print x, y, type(x), type(y)
> ...     i += 1
[...]
> What I'm hoping to be able to do is update the value, rather than replace 
> it, so that it gives me the total, i.e.
>
> server001 9
> server003 10
> server002 14
> server004 20 

This is easily done.

.>>> rdict = {}
.>>> for line in lst:
....     ans = line.split()
....     rdict[ans[0]] = rdict.get(ans[0], 0) + int(ans[3])
....
.>>> rdict
.{'server002': 14, 'server003': 8, 'server001': 9, 'server004': 12}

Dictionaries have a get() method, which has an optional second argument,
which gets returned if there is no such key in the dictionary.
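
For readers following along, here is a minimal sketch of the get()-based accumulation above, run against the eight sample lines quoted earlier in this message. It is written for Python 3 (print as a function) rather than the Python 2 used in the thread, and the function name aggregate_requests is made up for illustration.

```python
def aggregate_requests(lines):
    """Sum the requests/s column (field 3) per server name (field 0)."""
    totals = {}
    for line in lines:
        fields = line.split()
        name, requests = fields[0], int(fields[3])
        # get() returns the default 0 the first time a server name is seen
        totals[name] = totals.get(name, 0) + requests
    return totals


lines = [
    "server001  alive 17.1% 2 requests/s 14805416 total",
    "server001  alive 27.2% 7 requests/s 14851125 total",
    "server002  alive 22.9% 6 requests/s 15173311 total",
    "server002  alive 42.0% 8 requests/s 15147869 total",
    "server003  alive 17.9% 4 requests/s 15220280 total",
    "server003  alive 22.0% 4 requests/s 15260951 total",
    "server004  alive 18.5% 3 requests/s 15484524 total",
    "server004  alive 31.6% 9 requests/s 15429303 total",
]

totals = aggregate_requests(lines)
for name in sorted(totals):
    print(name, totals[name])
```

On this sample data the totals come out as 9, 14, 8 and 12 for server001 to server004, which matches the interpreter output above. collections.defaultdict(int) is another standard-library way to get the same default-to-zero behaviour.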

   Karl
-- 
Please do *not* send copies of replies to me.
I read the list
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor




Re: [Tutor] Dictionaries and aggregation

2006-04-25 Thread paul . churchill
Kent Johnson writes: 

>> However here's what I'm now trying to do:
>>  
>> 1)   Not have to rely on using awk at all.
>>  
>>  
>> 2)   Create a dictionary with server names for keys e.g. server001,
>> server002 etc and the aggregate of the request for that server as the value
>> part of the pairing.
>>  
>>  
>> I got this far with part 1)
>>  
>> lbstat = commands.getoutput("sudo ~/ZLBbalctl --action=cells")
>> tmpLst = lbstat.split('\n')
>>  
>> rLst = []
>> for i in tmpLst:
>>     m = re.search(' server[0-9]+', i)
>>     if m:
>>         rLst.append(i)
>>
>> for i in rLst:
>>     print i, type(i)
>>  
>>   server001  alive 22.3% 6 requests/s 14527762 total 
>>   server002  alive 23.5% 7 requests/s 14833265 total 
>>   server003  alive 38.2% 14 requests/s 14872750 total
>>   server004  alive 15.6% 4 requests/s 15083443 total 
>>   server001  alive 24.1% 8 requests/s 14473672 total 
>>   server002  alive 23.2% 7 requests/s 14810866 total 
>>   server003  alive 30.2% 8 requests/s 14918322 total 
>>   server004  alive 22.1% 6 requests/s 15137847 total 
>>  
>> At this point I ran out of ideas and began to think that there must be
>> something fundamentally wrong with my approach. Not least of my concerns was
>> the fact that I needed integers and these were strings.
> 
> Don't get discouraged, you are on the right track! You had one big 
> string that included some data you are interested in and some you don't 
> want, you have converted that to a list of strings containing only the 
> lines of interest. That is a good first step. Now you have to extract 
> the data you want out of each line. 
> 
> Use line.split() to split the text into fields by whitespace:
> In [1]: line = '  server001  alive 22.3% 6 requests/s 14527762 total'
> 
> In [2]: line.split()
> Out[2]: ['server001', 'alive', '22.3%', '6', 'requests/s', '14527762', 
> 'total'] 
> 
> Indexing will pull out the field you want:
> In [3]: line.split()[5]
> Out[3]: '14527762' 
> 
> It's still a string:
> In [4]: type(line.split()[5])
> Out[4]: <type 'str'>
> 
> Use int() to convert a string to an integer:
> In [5]: int(line.split()[5])
> Out[5]: 14527762 
> 
> Then you have to figure out how to accumulate the values in a dictionary 
> but get this much working first. 
> 
> Kent 
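
The split / index / int() steps above can be folded into one small helper. This is only a sketch in Python 3 (the thread itself uses Python 2); the name parse_line and the choice to return both the requests/s figure (field 3) and the running total (field 5) are assumptions made for illustration.

```python
def parse_line(line):
    """Return (server_name, requests_per_second, total_requests) for one status line."""
    fields = line.split()  # splits on any run of whitespace, ignoring leading spaces
    return fields[0], int(fields[3]), int(fields[5])


line = '  server001  alive 22.3% 6 requests/s 14527762 total'
name, rate, total = parse_line(line)
print(name, rate, total)
```

Splitting once and reusing the resulting list avoids calling split() again for every field.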
 


Thanks very much for the steer. I've made a fair bit of progress and look to 
be within touching distance of getting the problem cracked. 

 

Here's the list I'm starting with: 

>>> for i in rLst:
...     print i, type(i)

server001  alive 17.1% 2 requests/s 14805416 total <type 'str'>
server001  alive 27.2% 7 requests/s 14851125 total <type 'str'>
server002  alive 22.9% 6 requests/s 15173311 total <type 'str'>
server002  alive 42.0% 8 requests/s 15147869 total <type 'str'>
server003  alive 17.9% 4 requests/s 15220280 total <type 'str'>
server003  alive 22.0% 4 requests/s 15260951 total <type 'str'>
server004  alive 18.5% 3 requests/s 15484524 total <type 'str'>
server004  alive 31.6% 9 requests/s 15429303 total <type 'str'>

I've split each string in the list, extracted what I want and fed it into
an empty dictionary.

>>> rDict = {}
>>> i = 0
>>> while i < len(rLst):
...     x, y = rLst[i].split()[0], int(rLst[i].split()[3])
...     rDict[x] = y
...     print x, y, type(x), type(y)
...     i += 1

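server001 4 <type 'str'> <type 'int'>
server001 5 <type 'str'> <type 'int'>
server002 5 <type 'str'> <type 'int'>
server002 9 <type 'str'> <type 'int'>
server003 4 <type 'str'> <type 'int'>
server003 6 <type 'str'> <type 'int'>
server004 8 <type 'str'> <type 'int'>
server004 12 <type 'str'> <type 'int'>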

I end up with this. 

>>> for key, value in rDict.items():
...     print key, value

server001 5
server003 6
server002 9
server004 12 


As I understand things, this is because keys must be unique, so each value is
replaced by the one from the last key/value pair fed in from the loop.

What I'm hoping to be able to do is update the value, rather than replace
it, so that it gives me the total, i.e.

server001 9 
server003 10
server002 14
server004 20 
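
The overwriting described above is easy to reproduce in isolation. The sketch below (Python 3, with made-up variable names and two illustrative values) contrasts plain assignment with one way of updating the stored value, using a membership test before adding.

```python
pairs = [("server001", 2), ("server001", 7)]

# Plain assignment: the second value replaces the first.
overwritten = {}
for name, value in pairs:
    overwritten[name] = value

# Update in place: start each new key at 0, then add to it.
accumulated = {}
for name, value in pairs:
    if name not in accumulated:
        accumulated[name] = 0
    accumulated[name] += value

print(overwritten)
print(accumulated)
```

The first dictionary ends up holding only the last value seen for the key, while the second holds the sum.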

 

Regards, 

Paul 

 

 



[Tutor] Dictionaries and aggregation

2006-04-24 Thread Paul Churchill








 

I am trying to create a dictionary using data produced by a
load balancing admin tool and aggregate the results.

When I invoke the tool from within the shell ('sudo
~/ZLBbalctl --action=cells') the following output is produced:

 

Load Balancer 1 usage, over the last 30 seconds
Port 80, rules - /(nol)|(ws)
  server001  alive 18.1% 2 requests/s 14536543 total
  server002  alive 43.1% 7 requests/s 14842618 total
  server003  alive 21.2% 2 requests/s 14884487 total
  server004  alive 17.3% 2 requests/s 15092053 total

Load Balancer 2 usage, over the last 30 seconds
Port 80, rules - /(nol)|(ws)
  server001  alive 11.6% 2 requests/s 14482578 total
  server002  alive 35.6% 9 requests/s 14820991 total
  server003  alive 28.7% 6 requests/s 14928991 total
  server004  alive 23.7% 5 requests/s 15147525 total

  

 

 

I have managed to get something close to what I’m
looking for using lists, i.e. the aggregate of the fourth column (requests/s):

import commands, operator

lbstat = commands.getoutput("sudo ~/ZLBbalctl --action=cells | awk '$1 ~ /^server00/ { print $4 }'")
rLst = lbstat.split('\n')
rLst = [ int(rLst[i]) for i in range(len(rLst)) ]
rTotal = reduce(operator.add, rLst)
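
As an aside, the same total can be had with the built-in sum(), which avoids reduce() (in Python 3, reduce lives in functools). The sketch below is Python 3; awk_output is a hypothetical stand-in for the text the awk pipeline would print, filled with the requests/s values from the sample output above.

```python
# Stand-in for the awk pipeline's output: one requests/s value per line.
awk_output = "2\n7\n2\n2\n2\n9\n6\n5"

# Convert each line to an integer, then add them up.
values = [int(s) for s in awk_output.split("\n")]
total = sum(values)
print(total)
```

Iterating over the strings directly also removes the need for the range(len(...)) indexing.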

 

However here’s what I’m now trying to do:

1)   Not have to rely on using awk at all.

2)   Create a dictionary with server names for keys e.g. server001,
server002 etc and the aggregate of the request for that server as the value
part of the pairing.

 

 

I got this far with part 1)

import commands, re

lbstat = commands.getoutput("sudo ~/ZLBbalctl --action=cells")
tmpLst = lbstat.split('\n')

rLst = []
for i in tmpLst:
    m = re.search(' server[0-9]+', i)
    if m:
        rLst.append(i)

for i in rLst:
    print i, type(i)

 

  

  

  server001  alive 22.3% 6 requests/s 14527762 total <type 'str'>
  server002  alive 23.5% 7 requests/s 14833265 total <type 'str'>
  server003  alive 38.2%    14 requests/s 14872750 total <type 'str'>
  server004  alive 15.6% 4 requests/s 15083443 total <type 'str'>
  server001  alive 24.1% 8 requests/s 14473672 total <type 'str'>
  server002  alive 23.2% 7 requests/s 14810866 total <type 'str'>
  server003  alive 30.2% 8 requests/s 14918322 total <type 'str'>
  server004  alive 22.1% 6 requests/s 15137847 total <type 'str'>
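
The awk-free filtering step above can be tried without access to the load balancer tool by running it on a captured report. The sketch below is Python 3 and uses the "Load Balancer 1" sample output from this message as a hard-coded string; the names report, server_line and server_lines are made up for illustration. On a live system the text would come from the command instead (in Python 3, subprocess.getoutput plays the role of commands.getoutput).

```python
import re

# Captured sample output, standing in for the live command's output.
report = """Load Balancer 1 usage, over the last 30 seconds
Port 80, rules - /(nol)|(ws)
  server001  alive 18.1% 2 requests/s 14536543 total
  server002  alive 43.1% 7 requests/s 14842618 total
  server003  alive 21.2% 2 requests/s 14884487 total
  server004  alive 17.3% 2 requests/s 15092053 total
"""

# Keep only lines whose first field looks like a server name.
server_line = re.compile(r"^\s*server[0-9]+\s")
server_lines = [line for line in report.splitlines() if server_line.match(line)]

for line in server_lines:
    print(line.strip())
```

Compiling the pattern once and anchoring it at the start of the line keeps the header lines out even if the word "server" were to appear later in one of them.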

 

 

 

 

At this point I ran out of ideas and began to think that
there must be something fundamentally wrong with my approach. Not least of my
concerns was the fact that I needed integers and these were strings.

 

Any help would be much appreciated.

 

Regards,

 

Paul

 

 

 

 

 





