Re: grab dict keys/values without iterating ?!
Hi Dave! You were absolutely right. I don't want to iterate the entire dict to get me the key/values Let us say this dict would have 20.000 entries, but I want only those with Aa to be grabed. Those starting with these 2 letters would be only 5 or 6 then it would take a lot of time. In which way would you prefer to store the data, and which functions or methods would you use effectively to accomplish this task ? I deeply apologize of not defining the question more defined. English is not my mother tongue. I'll do my best next time. Thanks Tamer On 11.12.2013 06:47, Dave Angel wrote: On Wed, 11 Dec 2013 02:02:20 +0200, Tamer Higazi tamerito...@arcor.de wrote: Is there a way to get dict by search terms without iterating the entire dictionary ?! I want to grab the dict's key and values started with 'Ar'... Your wording is so ambiguous that each respondent has guessed differently. I'm guessing that you want all key/value pairs for which the key begins with the two letters 'Ar' I'm guessing further that your objection to iterating the entire dictionary is not code size but performance. If both assumptions are valid then I'll point out that a dict has no ordering to it. If you want an approach that doesn't iterate over the entire structure you'll need to store the data differently. For example if you stored all the keys in a sorted list you could use bisect. -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
Tamer Higazi wrote: Hi Dave! You were absolutely right. I don't want to iterate the entire dict to get me the key/values Let us say this dict would have 20.000 entries, but I want only those with Aa to be grabed. Those starting with these 2 letters would be only 5 or 6 then it would take a lot of time. In which way would you prefer to store the data, and which functions or methods would you use effectively to accomplish this task ? Well, Dave already gave one approach: [Dave Angel] For example if you stored all the keys in a sorted list you could use bisect. See also http://docs.python.org/dev/library/bisect.html Another option would be to use a database. Assuming the table 'lookup' has two columns 'key' and 'value' you'd get the matching rows with select key, value from lookup where key like 'Aa%'; A lightweight database that comes with Python is sqlite: http://docs.python.org/dev/library/sqlite3.html http://www.sqlite.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
Hi Peter! I got the message I know that I could have used a database. I am using for a good reason the ZODB Database. I am making things in the ZODB Database persistent, I don't like to distribute among machines. Making use of sqlite, won't give me the possibility to scale as the amount of data and requests are high. I am thinking of scalability. Of course I could use the MongoDB as well. But this is from my side for THESE KIND of things not desired. I am thinking of making Files through objects in the ZODB Database persistent, and relational databases on long time make me sick I will workout a bselect sollution for my problem! Thanks for your support. Tamer On 11.12.2013 14:10, Peter Otten wrote: Tamer Higazi wrote: Hi Dave! You were absolutely right. I don't want to iterate the entire dict to get me the key/values Let us say this dict would have 20.000 entries, but I want only those with Aa to be grabed. Those starting with these 2 letters would be only 5 or 6 then it would take a lot of time. In which way would you prefer to store the data, and which functions or methods would you use effectively to accomplish this task ? Well, Dave already gave one approach: [Dave Angel] For example if you stored all the keys in a sorted list you could use bisect. See also http://docs.python.org/dev/library/bisect.html Another option would be to use a database. Assuming the table 'lookup' has two columns 'key' and 'value' you'd get the matching rows with select key, value from lookup where key like 'Aa%'; A lightweight database that comes with Python is sqlite: http://docs.python.org/dev/library/sqlite3.html http://www.sqlite.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
Reordering to un-top-post. On 11.12.2013 06:47, Dave Angel wrote: On Wed, 11 Dec 2013 02:02:20 +0200, Tamer Higazi wrote: Is there a way to get dict by search terms without iterating the entire dictionary ?! I want to grab the dict's key and values started with 'Ar'... Your wording is so ambiguous that each respondent has guessed differently. I'm guessing that you want all key/value pairs for which the key begins with the two letters 'Ar' I'm guessing further that your objection to iterating the entire dictionary is not code size but performance. If both assumptions are valid then I'll point out that a dict has no ordering to it. If you want an approach that doesn't iterate over the entire structure you'll need to store the data differently. For example if you stored all the keys in a sorted list you could use bisect. On Wednesday, December 11, 2013 3:37:08 PM UTC+5:30, Tamer Higazi wrote: Hi Dave! You were absolutely right. I don't want to iterate the entire dict to get me the key/values Let us say this dict would have 20.000 entries, but I want only those with Aa to be grabed. Those starting with these 2 letters would be only 5 or 6 then it would take a lot of time. In which way would you prefer to store the data, and which functions or methods would you use effectively to accomplish this task ? The classic data structure for this is the trie: General idea: http://en.wikipedia.org/wiki/Trie In python: http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ I deeply apologize of not defining the question more defined. English is not my mother tongue. I'll do my best next time. English no issue. But better not to top-post http://en.wikipedia.org/wiki/Posting_style#Top-posting -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Wed, 11 Dec 2013 12:07:08 +0200, Tamer Higazi wrote: Hi Dave! You were absolutely right. I don't want to iterate the entire dict to get me the key/values Let us say this dict would have 20.000 entries, but I want only those with Aa to be grabed. Those starting with these 2 letters would be only 5 or 6 then it would take a lot of time. What do you mean by a lot of time? Here is a small test. I set up a dict with 456976 keys, and then iterate over them in just over a quarter of a second on my (old, slow) computer. Here is the code I use: data = {} letters = abcdefghijklmnopqrstuvwxyz for a in letters.upper(): for b in letters: for c in letters: for d in letters: key = a + b + c + d data[key] = None print(len(data)) count = 0 with Timer(): for key in data: if key.startswith(Aa): count += 1 print(Found %d keys starting with 'Aa') The Timer() function is not standard to Python, but you can find it here: http://code.activestate.com/recipes/577896 Are you sure that just using a normal dict will be too slow? In which way would you prefer to store the data, and which functions or methods would you use effectively to accomplish this task ? I would use a dict, and iterate over the keys, until such time that I new that iterating was the bottle-neck causing my code to be too slow. Until I knew that absolutely for sure, I would not optimize. If necessary, I would consider having 26 dicts, one for each initial letter: data = {} for c in ABCDEFGHIJKLMNOPQRSTUVWXYZ: data[c] = {} then store keys in the particular dict. That way, if I wanted keys starting with Aa, I would only search the A dict, not the B dict, C dict, etc. key = Aardvark data[key[0]][key] = some value -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 2013-12-11 13:44, Steven D'Aprano wrote: If necessary, I would consider having 26 dicts, one for each initial letter: data = {} for c in ABCDEFGHIJKLMNOPQRSTUVWXYZ: data[c] = {} then store keys in the particular dict. That way, if I wanted keys starting with Aa, I would only search the A dict, not the B dict, C dict, etc. That's what the convoluted code does that I put at the end of my previous post in this thread, only to the Nth degree (the outermost dict has the first letter which links to a dictionary of the 2nd level/letter, to the 3rd level/letter, etc). -tkc -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
In article 3efc283f-419d-41b6-ad20-c2901c3b9...@googlegroups.com, rusi rustompm...@gmail.com wrote: The classic data structure for this is the trie: General idea: http://en.wikipedia.org/wiki/Trie In python: http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ I agree that a trie fits the problem description well. If I were coding this up in C or C++, that's probably the data structure I'd use, with each node containing an array of pointers to nodes, and the array indexed by the ascii value of the next character (this doesn't translate well to unicode). Lookup speed would be awesomely fast. The problem is, that doesn't make a whole lot of sense in Python. The cited implementation uses dicts at each level. By the time you've done that, you might as well just throw all the data into one big dict and use the full search string as the key. It would be a lot less code, and probably run faster. Still, as an exercise in learning about a useful (and under-appreciated) data structure, implementing this as a trie sounds like a wonderful idea. -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Wednesday, December 11, 2013 8:16:12 PM UTC+5:30, Roy Smith wrote: rusi wrote: The classic data structure for this is the trie: General idea: http://en.wikipedia.org/wiki/Trie In python: http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ I agree that a trie fits the problem description well. If I were coding this up in C or C++, that's probably the data structure I'd use, with each node containing an array of pointers to nodes, and the array indexed by the ascii value of the next character (this doesn't translate well to unicode). Lookup speed would be awesomely fast. The problem is, that doesn't make a whole lot of sense in Python. The cited implementation uses dicts at each level. By the time you've done that, you might as well just throw all the data into one big dict and use the full search string as the key. It would be a lot less code, and probably run faster. Still, as an exercise in learning about a useful (and under-appreciated) data structure, implementing this as a trie sounds like a wonderful idea. I was going to say that Steven's data = {} for c in ABCDEFGHIJKLMNOPQRSTUVWXYZ: data[c] = {} is really the first step of the trie approach except that instead of key = Aardvark data[key[0]][key] = some value he would need data[key[0]][key[1:] = some value The catch is keys of length 1 eg A Which makes the fully general (recursively unfolded) version easier to write, though with all those zillions of dicts probably not very efficient. Finitizing the trie into fixed small prefixes with leaves containing standard dicts looks sensible (to me without and testing of either efficiency or readability!) The short prefix problem however remains Of course there are other tricks possible: If we know that the data is case-insensitive alphabetic ASCII* we can just normalize the data with def norm(s): return [ord(c.upper()) -ord('A') for c in s] then norm(aBcD) [0, 1, 2, 3] and instead of dicts we can use 26-length lists * O me O my! Ive used the terrible A-word! Now RUN before the trolls get us! -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 2013-12-11 09:46, Roy Smith wrote: The problem is, that doesn't make a whole lot of sense in Python. The cited implementation uses dicts at each level. By the time you've done that, you might as well just throw all the data into one big dict and use the full search string as the key. It would be a lot less code, and probably run faster. You're right if the search term is a whole word, a single dict-to-result works ideally. However, the OP asked about prefixes, so the Python implementation I provided uses a dict-of-nested-dicts which allows any arbitrary prefix, and then iterates over only the subset of those that match. If you need O(length-of-prefix) iteration of all results, I believe that's the best way algorithm to use (implementation details could differ for a more space-efficient structure perhaps; normalization might help reduce dict-entries). It's a specialized use-case, and doesn't have the O(1) lookup for exact-matches (that's just a special case of prefix=word with no sub-iteration, so would be O(length-of-search-word)). -tkc -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 11/12/2013 00:02, Tamer Higazi wrote: Hi people! Is there a way to get dict by search terms without iterating the entire dictionary ?! Let us assume I have: {'Amanda':'Power','Amaly':'Higgens','Joseph':'White','Arlington','Black','Arnold','Schwarzenegger'} I want to grab the dict's key and values started with 'Ar'... I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. Why do I ask you?! I am working with the ZODB Database, where I make use of a PersistentDict and PersistentList, and I want I would thank you for a short reply. Tamer Plenty of answers already but this may be of interest http://stackoverflow.com/questions/2288901/best-data-structure-for-crossword-puzzle-search?lq=1 -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Dec 11, 2013, at 5:31 AM, rusi rustompm...@gmail.com wrote: The classic data structure for this is the trie: General idea: http://en.wikipedia.org/wiki/Trie In python: http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ My thoughts exactly! If you wade through the comments there, someone has done a more-than-naive implementation here: https://github.com/kmike/marisa-trie The write up makes it look pretty favorable as well for performance (scroll 2/3s down to the Benchmarks section). -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 11/12/2013 17:19, Travis Griggs wrote: On Dec 11, 2013, at 5:31 AM, rusi rustompm...@gmail.com wrote: The classic data structure for this is the trie: General idea: http://en.wikipedia.org/wiki/Trie In python: http://stackoverflow.com/questions/11015320/how-to-create-a-trie-in-python/ My thoughts exactly! If you wade through the comments there, someone has done a more-than-naive implementation here: https://github.com/kmike/marisa-trie The write up makes it look pretty favorable as well for performance (scroll 2/3s down to the Benchmarks section). http://kmike.ru/python-data-structures/ from the author of the above is well worth a read. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Wed, Dec 11, 2013 at 7:30 AM, Tim Chase python.l...@tim.thechases.com wrote: On 2013-12-11 13:44, Steven D'Aprano wrote: If necessary, I would consider having 26 dicts, one for each initial letter: data = {} for c in ABCDEFGHIJKLMNOPQRSTUVWXYZ: data[c] = {} then store keys in the particular dict. That way, if I wanted keys starting with Aa, I would only search the A dict, not the B dict, C dict, etc. That's what the convoluted code does that I put at the end of my previous post in this thread, only to the Nth degree (the outermost dict has the first letter which links to a dictionary of the 2nd level/letter, to the 3rd level/letter, etc). This is what I did not so long ago when writing a utility for typeahead lookup, except that to save some space and time I only nested the dicts as deeply as there were still multiple entries. As an example of what the data structure looked like: lookups = { 'a': { 'l': { 'g': 'algebra', 'p': 'alphanumeric', }, 's': 'asterisk', }, 'b': 'bobcat', ... } It does make the update process more complicated though, as adding new words potentially requires existing words to be nested deeper than they are currently. -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Wed, Dec 11, 2013 at 6:02 PM, Ian Kelly ian.g.ke...@gmail.com wrote: This is what I did not so long ago when writing a utility for typeahead lookup, except that to save some space and time I only nested the dicts as deeply as there were still multiple entries. As an example of what the data structure looked like: lookups = { 'a': { 'l': { 'g': 'algebra', 'p': 'alphanumeric', }, 's': 'asterisk', }, 'b': 'bobcat', ... } It does make the update process more complicated though, as adding new words potentially requires existing words to be nested deeper than they are currently. And I'm simplifying that a bit, because I also included at each node the preferred (in my case, the first alphabetically) completion for that prefix, to avoid the need to iterate over the subtree. -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Tue, Dec 10, 2013 at 4:02 PM, Tamer Higazi tamerito...@arcor.de wrote: Hi people! Is there a way to get dict by search terms without iterating the entire dictionary ?! Let us assume I have: {'Amanda':'Power','Amaly':'Higgens','Joseph':'White',' Arlington','Black','Arnold','Schwarzenegger'} I want to grab the dict's key and values started with 'Ar'... I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. Why do I ask you?! I am working with the ZODB Database, where I make use of a PersistentDict and PersistentList, and I want I would thank you for a short reply. A treap or red-black tree could be made to do this in O(logn) time. They can be made to look like dictionaries, but they keep things ordered by key every step of the way. I can tell you right now though, that the following two implementations aren't quite all the way there for this task: https://pypi.python.org/pypi/treap https://pypi.python.org/pypi/red-black-tree-mod The treap implementation doesn't have a successor method and won't do a prefix search out of the box. The red-black tree implementation has a successor method, but won't do a prefix search out of the box. I believe you want something with both successor and prefix search. And if you want to work on adding this to either of the aforementioned modules, I'm interested in merging your patch. ^_^ There may be other implementations that'll do this already; I don't know about that. Other things that may help are BTrees and things of that ilk. Finally, if your dictionaries aren't very big, you might find it easier to just get a list of keys and do a binary search on them. It's O(n) (assuming you do it only once - doing it in a loop would be expensive), but sometimes that's not a problem. -- https://mail.python.org/mailman/listinfo/python-list
grab dict keys/values without iterating ?!
Hi people! Is there a way to get dict by search terms without iterating the entire dictionary ?! Let us assume I have: {'Amanda':'Power','Amaly':'Higgens','Joseph':'White','Arlington','Black','Arnold','Schwarzenegger'} I want to grab the dict's key and values started with 'Ar'... I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. Why do I ask you?! I am working with the ZODB Database, where I make use of a PersistentDict and PersistentList, and I want I would thank you for a short reply. Tamer -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 11/12/2013 00:02, Tamer Higazi wrote: Hi people! Is there a way to get dict by search terms without iterating the entire dictionary ?! Let us assume I have: {'Amanda':'Power','Amaly':'Higgens','Joseph':'White','Arlington','Black','Arnold','Schwarzenegger'} I want to grab the dict's key and values started with 'Ar'... I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. Why do I ask you?! I am working with the ZODB Database, where I make use of a PersistentDict and PersistentList, and I want I would thank you for a short reply. There isn't a special shortcut because it's already a short journey. :-) -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On 2013-12-11 02:02, Tamer Higazi wrote: Is there a way to get dict by search terms without iterating the entire dictionary ?! Let us assume I have: {'Amanda':'Power','Amaly':'Higgens','Joseph':'White','Arlington','Black','Arnold','Schwarzenegger'} ^ Assuming this is a colon... I want to grab the dict's key and values started with 'Ar'... I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. I would thank you for a short reply. No. If you want a longer reply, keep reading. :-) There are a couple ways you could get the results you want, but it depends on what you want to test. If you just want to test the presence/absence of something matching your pattern, you can bail early with if any(k.startswith(s) or v.startswith(s) for k,v in d.iteritems()): yep() else: nope() which will only pull items from the iterator until the first one is found. If you want to find all the results, you have to look at all the results, either at search time, or at the time of dictionary construction (although in this case, you'd also need to know the search term at the time of construction). That might look something like d = {} interesting = [] def is_interesting(term, *args): return any(s.startswith(term) for s in args) for key, value, other in data_source: if is_interesting(term, key, value): interesting.append((key, value)) d[key] = value for key, value in interesting: just_iterating_over_those_not_all(key, value) Alternatively, you could use a different data-structure, something like (untested) SENTINEL = object() def build_structure(i): # i is some iterable of key/value pairs # such as from {}.iteritems() root = {} for key, value in i: loc = root for letter in key: loc = loc.setdefault(letter, {}) loc[SENTINEL] = value return root def just_matches(root, pattern): if pattern: first_letter = pattern[0] if first_letter in root: for result in just_matches(root[first_letter], pattern[1:]): yield result else: # show all sub-items for k,v in root.iteritems(): if k is SENTINEL: yield v else: for result in just_matches(v, None): yield result s = build_structure(data.iteritems()) # print repr(s) for result in just_matches(s, Ar): print result I'm not sure this is a good trade-off, but it does limit your iteration to only those that begin with your chosen search term, rather than all items in the dictionary once the initial data-structure is built. (I think in the span of 20 minutes, I managed to recreate a college homework assignment that took me nearly a week in C) -tkc -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
Tamer Higazi tamerito...@arcor.de writes: Is there a way to get dict by search terms without iterating the entire dictionary ?! (A language note: you may be unaware that “?!” does not connote a simple question, but outrage or incredulity or some other indignant expression. This implies not a polite query, but more a harsh demand for the other person to explain. I think your questions will communicate better punctuated simply with “?”.) I want to grab the dict's key and values started with 'Ar'... surnames_by_givenname = { Amanda: Power, Amaly: Higgens, Joseph: White, Arlington: Black, Arnold: Schwarzenegger, } items_whose_keys_start_with_Ar = [ (key, value) for (key, value) in surnames_by_givenname.items() if key.startswith(Ar)] I could make an iterator and look if it's inside. I wasn't able to find it, but I am asking myself if dict has a builtin method to get me these key/values on the fly. Not a method on the dict, but a method on the string and a list comprehension (or, if you prefer, use a generator expression or dict comprehension, etc.). Why do I ask you?! I am working with the ZODB Database, where I make use of a PersistentDict and PersistentList, and I want I'll be interested to know the rest of that paragraph, to know whether the above list comprehension meets your constraints. -- \ “I was in the first submarine. Instead of a periscope, they had | `\ a kaleidoscope. ‘We're surrounded.’” —Steven Wright | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: grab dict keys/values without iterating ?!
On Wed, 11 Dec 2013 02:02:20 +0200, Tamer Higazi tamerito...@arcor.de wrote: Is there a way to get dict by search terms without iterating the entire dictionary ?! I want to grab the dict's key and values started with 'Ar'... Your wording is so ambiguous that each respondent has guessed differently. I'm guessing that you want all key/value pairs for which the key begins with the two letters 'Ar' I'm guessing further that your objection to iterating the entire dictionary is not code size but performance. If both assumptions are valid then I'll point out that a dict has no ordering to it. If you want an approach that doesn't iterate over the entire structure you'll need to store the data differently. For example if you stored all the keys in a sorted list you could use bisect. -- DaveA -- https://mail.python.org/mailman/listinfo/python-list