unique values of a Dictionary list (removing duplicate elements of a list)

2010-05-21 Thread Chad Kellerman
Python users,
  I am parsing an AIX trace file and creating a dictionary whose keys are
PIDs (unique process ids) and whose values are lists of TIDs (thread ids).
My function populates the keys so that they are unique, but the lists
contain duplicates.

Can someone point me in the right direction so that my dictionary values
do not contain duplicate elements?


here is what I got.

--portion of code that is relevant--

import re

pidtids  = {}

# --- function to add pid and tid to a dictionary
def addpidtids(pid,tid):
    pidtids.setdefault(pid,[]).append(tid)

# --- function to parse a file
def grep(pattern, fileObj, include_line_nums=False):
    r=[]
    compgrep = re.compile(pattern)

    for line_num, line in enumerate(fileObj):
        if compgrep.search(line):
            info = line.split()
            # strip the leading pid= / tid= label characters
            p = info[7].lstrip('pid=')
            t = info[8].lstrip('tid=')
            addpidtids(p,t)


# process trace.int
tf = open(tracefile, 'r')
grep('cmd=java pid',tf)
tf.close()

--/portion of code that is relevant--

Any help would be greatly appreciated.


Thanks,
Chad

-- 
A grasshopper walks into a bar and the bartender says Hey, we have a drink
named after you. And the grasshopper says Really, You have a drink named
Murray?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unique values of a Dictionary list (removing duplicate elements of a list)

2010-05-21 Thread Peter Otten
Chad Kellerman wrote:

> pidtids  = {}
>
> # --- function to add pid and tid to a dictionary
> def addpidtids(pid,tid):
>     pidtids.setdefault(pid,[]).append(tid)

Use a set instead of a list (and maybe a defaultdict):

from collections import defaultdict

pidtids = defaultdict(set)

def addpidtids(pid, tid):
    pidtids[pid].add(tid)
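
A minimal sketch of this suggestion in use, with made-up pid/tid pairs (the sample values and the name unique_tids are illustrative assumptions, not taken from the trace file; defaultdict needs Python 2.5 or later):

```python
from collections import defaultdict

# Maps each pid to the set of tids seen for it; adding a tid twice has no effect.
pidtids = defaultdict(set)

def addpidtids(pid, tid):
    pidtids[pid].add(tid)

# Hypothetical sample data: the pair ("100", "1") appears twice.
samples = [("100", "1"), ("100", "2"), ("100", "1"), ("200", "7")]
for pid, tid in samples:
    addpidtids(pid, tid)

# Sets are unordered, so sort them when a stable listing is needed.
unique_tids = dict((pid, sorted(tids)) for pid, tids in pidtids.items())
```

Note that a set does not remember the order in which the tids were first seen; if that order matters, a list with a membership check may be the better fit.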

Peter



Re: unique values of a Dictionary list (removing duplicate elements of a list)

2010-05-21 Thread Chad Kellerman
On Fri, May 21, 2010 at 7:50 AM, Peter Otten __pete...@web.de wrote:

> Use a set instead of a list (and maybe a defaultdict):
>
> from collections import defaultdict
>
> pidtids = defaultdict(set)
>
> def addpidtids(pid, tid):
>     pidtids[pid].add(tid)
>
> Peter


Thanks.  I guess I should have posted this in my original question.

I'm on 2.4.3, and it looks like defaultdict is new in 2.5.

I'll see if I can upgrade.

Thanks again.






Re: unique values of a Dictionary list (removing duplicate elements of a list)

2010-05-21 Thread Chad Kellerman
On Fri, May 21, 2010 at 8:07 AM, Chad Kellerman sunck...@gmail.com wrote:



> Thanks.  I guess I should have posted this in my original question.
>
> I'm on 2.4.3, and it looks like defaultdict is new in 2.5.
>
> I'll see if I can upgrade.
>
> Thanks again.



Instead of upgrading (it would probably be faster to use techniques
available in 2.4.3):

Couldn't I check whether the pid exists (has_key, I believe) and then check
whether the tid is already in the list for that key, before passing it to
the function?

Or would that be too 'expensive'?
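
One way the check described above might look, as a rough sketch that keeps the values as lists (the sample pairs are made up for illustration; `pid in pidtids` is used in place of has_key, which was removed in Python 3):

```python
pidtids = {}

def addpidtids(pid, tid):
    # Fetch the list for this pid, creating an empty one on first sight.
    tids = pidtids.setdefault(pid, [])
    # "tid not in tids" scans the whole list, so each call costs time
    # proportional to the number of tids already stored for this pid.
    if tid not in tids:
        tids.append(tid)

# Hypothetical sample data: the pair ("100", "1") appears twice.
for pid, tid in [("100", "2"), ("100", "1"), ("100", "1"), ("200", "7")]:
    addpidtids(pid, tid)
```

The lists stay free of duplicates and keep the order in which the tids first appeared; the cost is the linear scan on every call, which only matters once a single pid has many tids.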








Re: unique values of a Dictionary list (removing duplicate elements of a list)

2010-05-21 Thread Peter Otten
Chad Kellerman wrote:

> Instead of upgrading (it would probably be faster to use techniques
> available in 2.4.3):
>
> Couldn't I check whether the pid exists (has_key, I believe) and then
> check whether the tid is already in the list for that key, before
> passing it to the function?
>
> Or would that be too 'expensive'?

No.

pidtids = {}
def addpidtids(pid, tid):
    if pid in pidtids:
        pidtids[pid].add(tid)
    else:
        pidtids[pid] = set((tid,))

should be faster than

def addpidtids(pid, tid):
    pidtids.setdefault(pid, set()).add(tid)

and both should work in Python 2.4.
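
A rough way to check the speed claim on your own machine is sketched below. The function names, the workload, and the repeat count are all assumptions made up for the comparison, and passing a callable to timeit.timeit needs a newer Python than 2.4. The likely reason for any difference is that the setdefault version builds a fresh set() on every call, even when the pid is already a key.

```python
import timeit

def add_with_in(d, pid, tid):
    if pid in d:
        d[pid].add(tid)
    else:
        d[pid] = set((tid,))

def add_with_setdefault(d, pid, tid):
    # set() is constructed on every call, even when pid is already a key.
    d.setdefault(pid, set()).add(tid)

# Hypothetical workload: 10 pids with 50 tids each, every pair seen twice.
pairs = [(str(p), str(t)) for p in range(10) for t in range(50)] * 2

def run(add):
    d = {}
    for pid, tid in pairs:
        add(d, pid, tid)
    return d

# Both variants should build exactly the same dictionary.
same_result = run(add_with_in) == run(add_with_setdefault)

t_in = timeit.timeit(lambda: run(add_with_in), number=200)
t_setdefault = timeit.timeit(lambda: run(add_with_setdefault), number=200)
print("in/else: %.4fs  setdefault: %.4fs" % (t_in, t_setdefault))
```

The absolute numbers will vary with the interpreter and the data, so treat the output as a hint rather than a benchmark.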

Peter
