Daniel Kersten wrote:
> PS: So, yes, they are read-only.
>
> 2008/12/15 Daniel Kersten <[email protected]>:
>
>> Ok, I'll give a few more details as to what I'm doing.
>>
>> Basically, I have a little Python app which analyses log files (these
>> log files are large; I have one here that's incomplete and is already
>> 200MB). Each entry contains a number of fields which I package into
>> convenient little objects.
>>
>> The objects represent "messages" and the fields are addresses of those
>> messages (transaction IDs etc.) and I need to verify that if I get a
>> message of type A that I then receive a message of type B with
>> matching transaction IDs and address X in range Y... you get the
>> idea, I hope.
>>
>> I could use SQLite for this (and that might even be a good solution),
>> though I'd like to keep it in plain Python, if possible, since it's
>> meant to just be a little script which I can run over the log files on
>> whatever machine it happens to be on, though I may settle for using
>> SQLite if there's no alternative.
>>
>> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>>
>>> Hi,
>>>
>>> you could create an SQLite table with the properties you want to filter as
>>> columns and an extra column with the index of the object in your large list
>>> (if not the full object).
>>> Then you have the full power of SQL and you can create indexes as needed.
>>> It is quite flexible.
>>>
>>> You haven't said if the list of objects can be updated or is just read-only.
>>>
>>> Juan
>>>
>>> Daniel Kersten wrote:
>>>
>>> Hi all,
>>>
>>> I have a large list of objects which I'd like to filter on various criteria.
>>> For example, I'd like to do something like:
>>> give me all objects o where o.a == "A" and o.b == "B" and o.c in [...]
>>>
>>> I thought of storing references to these objects in dictionaries, so
>>> that I can look them up by their values (e.g. dict_of_a would contain
>>> all objects where its value is the object and the key is that object's
>>> value of 'a'; this way if I do dict_of_a[o.a] I get back [o] (or more
>>> elements, if other objects have the same value)) and then look up each
>>> field and then perform a set union to get all objects which match the
>>> desired criteria (though this doesn't work for the `in` operator). I
>>> hope that made sense.
>>>
>>> The problem is that I have a large list of these objects (well over
>>> 100k) and I was wondering if there was a better way of doing this?
>>> Perhaps a super-efficient built-in query object? Anything?
>>>
>>> I'm probably doing it wrong anyway, so any tips or ideas to push me
>>> towards a proper solution would be greatly appreciated.
>>>
>>> Thanks,
>>> Dan.
>>>
>>
>> --
>> Daniel Kersten.
>> Leveraging dynamic paradigms since the synergies of 1985.
>>
>

Could you use a chain of generators to handle the filtering process? So
if your objects (logobj) were instantiated from a line of the log file,
then:

    fp = open(yourlog, 'r')
    events = (logobj(x) for x in fp)
    match_one = (e for e in events if e.a == 'A')
    match_two = (m for m in match_one if m.b == 'B')

would result in the generator match_two which would provide objects with
o.a == 'A' and o.b == 'B'. You can reuse or reorganise the generators as
necessary to perform the desired filtering.

Padraig
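
For illustration, here is a slightly fuller sketch of the same generator-chain idea, including the `o.c in [...]` case from the original question. It is not taken from the thread: the `LogObj` record, the comma-separated line format, and the helper names `parse_log`, `where`, `where_in` and `query` are all assumptions made up for this example. One caveat worth keeping in mind is that a generator can only be consumed once, so running a second query means rebuilding the chain from the source lines.

```python
from collections import namedtuple

# Hypothetical record type: the thread only names the fields a, b and c.
LogObj = namedtuple("LogObj", ["a", "b", "c"])


def parse_log(lines):
    """Lazily turn each non-empty line into a LogObj (assumed format: 'a,b,c')."""
    for line in lines:
        line = line.strip()
        if line:
            a, b, c = line.split(",")
            yield LogObj(a, b, c)


def where(events, **equals):
    """Lazily keep the events whose attributes equal the given values."""
    for event in events:
        if all(getattr(event, name) == value for name, value in equals.items()):
            yield event


def where_in(events, name, allowed):
    """Lazily keep the events whose attribute `name` is one of `allowed`."""
    allowed = set(allowed)  # set membership tests are O(1) on average
    for event in events:
        if getattr(event, name) in allowed:
            yield event


def query(lines):
    """o.a == 'A' and o.b == 'B' and o.c in ['x', 'y'], built as a chain."""
    events = parse_log(lines)
    match_ab = where(events, a="A", b="B")
    return where_in(match_ab, "c", ["x", "y"])


# Any iterable of lines works here; an open file object would do as well.
sample = ["A,B,x", "A,B,z", "A,C,x", "Q,B,y", "A,B,y"]
matches = list(query(sample))
```

With the sample lines above, `matches` holds the two records built from `A,B,x` and `A,B,y`. Nothing is parsed until `list()` pulls from the end of the chain, so passing an open file object instead of `sample` reads the log one line at a time and keeps only the matching records in memory.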
