Re: RE: Fast full-text searching in Python (job for Whoosh?)
On 3/7/2023 2:02 PM, avi.e.gr...@gmail.com wrote:
> Some of the discussions here leave me confused as the info we think we got early does not last long intact and often morphs into something else, and we find much of the discussion is misdirected or wasted.

Apologies. I'm the OP and also the OS (original sinner). My "mistake" was to go for a "stream of consciousness" kind of question, rather than a well researched and thought out one.

You are correct, Avi. I have a simple web UI. I came across the Whoosh video and got infatuated with the idea that Whoosh could be used to create an autofill function, as my backend is already Python/Flask. As many have observed and as I also quickly realized, Whoosh was overkill for my use case.

In the meantime people started asking questions, I responded and, before you know it, we are all discussing the intricacies of JavaScript web development in a Python forum. Should I have stopped them? How?

One thing is for sure: I am really grateful that so many used so much of their time to help. A big thank you to each of you, friends.

Dino

-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/7/2023 1:28 PM, David Lowry-Duda wrote:
> But I'll note that I use whoosh from time to time and I find it stable and pleasant to work with. It's true that development stopped, but it stopped in a very stable place. I don't recommend using whoosh here, but I would recommend experimenting with it more generally.

Thank you, David. Noted.
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/6/2023 11:05 PM, rbowman wrote:
> It must be nice to have a server or two...

No kidding.

About everything else you wrote, it makes a ton of sense; in fact, it's a dilemma I am facing now. My back-end returns 10 entries (I am limiting it to max 10 matches server side, for reasons you can imagine). As the user keeps typing, should I restrict the existing result set based on the new information, or re-issue an API call to the server? Things get confusing pretty fast for the user. You don't want too many cooks in the kitchen, I guess.

I played a little bit with both approaches in my little application. Re-requesting from the server seems to win hands down in my case. I am sure that them google engineers reached spectacular levels of UI finesse with stuff like this.

> On Mon, 6 Mar 2023 21:55:37 -0500, Dino wrote:
>> https://schier.co/blog/wait-for-user-to-stop-typing-using-javascript
>
> That could be annoying. My use case is address entry. When the user types "102 ma" the suggestions might be main, manson, maple, massachusetts, masten in a simple case. When they enter 's' it's narrowed down. Typically I'm only dealing with a city or county so the data to be searched isn't huge. The maps.google.com address search covers the world and they're also throwing in a geographical constraint so the suggestions are applicable to the area you're viewing.
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/4/2023 10:43 PM, Dino wrote:
> I need fast text-search on a large (not huge, let's say 30k records totally) list of items. Here's a sample of my raw data (a list of US cars: model and make)

Gentlemen, thanks a ton to everyone who offered to help (and did help!). I loved the part where some tried to divine the true meaning of my words :)

What you guys wrote is correct: the grep-esque search is guaranteed to turn up a ton of false positives, but for the autofill use-case, that's actually OK. Users will quickly figure out what is not relevant and skip those entries, just to zero in on the suggestion that they find relevant.

One issue that was also correctly foreseen by some is that there's going to be a new request at every user keystroke. Known problem. JavaScript programmers use a trick called "debouncing" to be reasonably sure that the user is done typing before a request is issued: https://schier.co/blog/wait-for-user-to-stop-typing-using-javascript

I was able to apply that successfully and I am now very pleased with the final result.

Apologies if I posted 1400 lines of data file. Seeing that certain newsgroups carry gigabytes of copyright-infringing material must have conveyed the wrong impression to me.

Thank you.

Dino
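For the archives: the debouncing trick is usually done in JavaScript, but the idea can be sketched in pure Python with threading.Timer. The Debouncer class and the 0.05 s delay below are mine, not from the linked article:

```python
import threading
import time

class Debouncer:
    """Run fn only once events stop arriving for `wait` seconds."""
    def __init__(self, wait, fn):
        self.wait = wait
        self.fn = fn
        self._timer = None

    def trigger(self, *args):
        if self._timer is not None:
            self._timer.cancel()          # a new keystroke resets the clock
        self._timer = threading.Timer(self.wait, self.fn, args)
        self._timer.start()

queries = []
debounced = Debouncer(0.05, queries.append)
for partial in ("v", "v6", "v60"):        # three rapid "keystrokes"
    debounced.trigger(partial)
time.sleep(0.2)                           # let the last timer fire
print(queries)                            # only the final query survives: ['v60']
```

Each call to trigger() cancels the pending timer, so only the last keystroke in a burst actually reaches the callback.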
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/5/2023 9:05 PM, Thomas Passin wrote:
> I would probably ingest the data at startup into a dictionary - or perhaps several, depending on your access patterns - and then you will only need to do a fast lookup in one or more dictionaries. If your access pattern would be easier with SQL queries, load the data into an SQLite database on startup.

Thank you. SQLite would be overkill here, plus all the machinery that I would need to set up to make sure that the DB is rebuilt/updated regularly. Do you happen to know something about Whoosh? Have you ever used it?

> IOW, do the bulk of the work once at startup.

Sound advice. Thank you.
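A minimal sketch of the ingest-once-then-search approach being discussed (the sample rows and function names here are mine, standing in for the real all_cars_unique.csv):

```python
import csv
import io

# Hypothetical sample standing in for all_cars_unique.csv
SAMPLE = "Acura,CL\nGenesis,GV60\nVolvo,V60\nVolvo,XC90\n"

def load_cars(fileobj):
    """Read (make, model) rows once at startup; keep a lowercase copy for matching."""
    rows = [(make, model) for make, model in csv.reader(fileobj)]
    return [(make, model, f"{make},{model}".lower()) for make, model in rows]

def lookup(cars, query, limit=10):
    """Case-insensitive substring search over the pre-lowercased column."""
    q = query.lower()
    return [{"manufacturer": make, "model": model}
            for make, model, low in cars if q in low][:limit]

cars = load_cars(io.StringIO(SAMPLE))
print(lookup(cars, "v60"))
# [{'manufacturer': 'Genesis', 'model': 'GV60'}, {'manufacturer': 'Volvo', 'model': 'V60'}]
```

Lowercasing once at load time keeps the per-request work down to a single linear scan, which is plenty for ~1400 rows.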
Re: RE: Fast full-text searching in Python (job for Whoosh?)
Thank you for taking the time to write such a detailed answer, Avi. And apologies for not providing more info from the get-go.

What I am trying to achieve here is supporting autocomplete (no pun intended) in a web form field, hence the -i case-insensitive example in my initial question.

Your points are all good, and my original question was a bit rushed. I guess that the problem was that I saw this video: https://www.youtube.com/watch?v=gRvZbYtwTeo&ab_channel=NextDayVideo

The idea that someone types into an input field and matches start dancing in the browser made me think that this was exactly what I needed, and hence I figured that asking here about Whoosh would be a good idea. I now realize that Whoosh would be overkill for my use-case, as a simple (case-insensitive) substring query would get me 90% of what I want. Speed is in the order of a few milliseconds out of the box, which is chump change in the context of a web UI.

Thank you again for taking the time to look at my question

Dino

On 3/5/2023 10:56 PM, avi.e.gr...@gmail.com wrote:
> Dino,
>
> Sending lots of data to an archived forum is not a great idea. I snipped most of it out below so as not to replicate it.
>
> Your question does not look difficult unless your real question is about speed. Realistically, much of the time spent generally is in reading in a file, and the actual search can be quite rapid with a wide range of methods.
>
> The data looks boring enough and seems to not have much structure other than one comma possibly separating two fields. Do you want the data as one wide field, or perhaps in two parts, which is what a CSV file is normally used to represent? Do you ever have questions like "tell me all cars whose name begins with the letter D and has a V6 engine"? If so, you may want more than a vanilla search. What exactly do you want to search for? Is it a set of built-in searches or something the user types in?
> The data seems to be sorted by the first field and then by the second, and I did not check if some searches might be ambiguous. Can there be many entries containing III? Yep. Can the same words like Cruiser or Hybrid appear? So is this a one-time search, or multiple searches once loaded, as in a service that stays resident and fields requests? The latter may be worth speeding up.
>
> I don't NEED to know any of this, but want you to know that the answer may depend on this and similar factors. We had a long discussion lately on whether to search using regular expressions or string methods.
>
> If your data is meant to be used once, you may not even need to read the file into memory, but read something like a line at a time and test it. Or, if you end up with more data, like how many cylinders a car has, it may be time to read it in not just to a list of lines or such data structures, but get numpy/pandas involved and use their many search methods in something like a data.frame. Of course, if you are worried about portability, keep using Get Regular Expression Print.
>
> Your example was:
>
> $ grep -i v60 all_cars_unique.csv
> Genesis,GV60
> Volvo,V60
>
> You seem to have wanted case folding, and that is NOT a normal search. And your search is matching anything on any line. If you wanted only a complete field, such as all text after a comma to the end of the line, you could use grep specifications to say that.
>
> But once inside python, you would need to make choices depending on what kind of searches you want to allow, but also things like: do you want all matching lines shown if you search for, say, "a" ...
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/5/2023 1:19 AM, Greg Ewing wrote:
> I just did a similar test with your actual data and got about the same result. If that's fast enough for you, then you don't need to do anything fancy.

Thank you, Greg. That's what I am going to do in fact.
Re: Fast full-text searching in Python (job for Whoosh?)
Here's the complete data file should anyone care. Acura,CL Acura,ILX Acura,Integra Acura,Legend Acura,MDX Acura,MDX Sport Hybrid Acura,NSX Acura,RDX Acura,RL Acura,RLX Acura,RLX Sport Hybrid Acura,RSX Acura,SLX Acura,TL Acura,TLX Acura,TSX Acura,Vigor Acura,ZDX Alfa Romeo,164 Alfa Romeo,4C Alfa Romeo,4C Spider Alfa Romeo,Giulia Alfa Romeo,Spider Alfa Romeo,Stelvio Alfa Romeo,Tonale Aston Martin,DB11 Aston Martin,DB9 Aston Martin,DB9 GT Aston Martin,DBS Aston Martin,DBS Superleggera Aston Martin,DBX Aston Martin,Rapide Aston Martin,Rapide S Aston Martin,Vanquish Aston Martin,Vanquish S Aston Martin,Vantage Aston Martin,Virage Audi,100 Audi,80 Audi,90 Audi,A3 Audi,A3 Sportback e-tron Audi,A4 Audi,A4 (2005.5) Audi,A4 allroad Audi,A5 Audi,A5 Sport Audi,A6 Audi,A6 allroad Audi,A7 Audi,A8 Audi,Cabriolet Audi,Q3 Audi,Q4 Sportback e-tron Audi,Q4 e-tron Audi,Q5 Audi,Q5 Sportback Audi,Q7 Audi,Q8 Audi,Quattro Audi,R8 Audi,RS 3 Audi,RS 4 Audi,RS 5 Audi,RS 6 Audi,RS 7 Audi,RS Q8 Audi,RS e-tron GT Audi,S3 Audi,S4 Audi,S4 (2005.5) Audi,S5 Audi,S6 Audi,S7 Audi,S8 Audi,SQ5 Audi,SQ5 Sportback Audi,SQ7 Audi,SQ8 Audi,TT Audi,allroad Audi,e-tron Audi,e-tron GT Audi,e-tron S Audi,e-tron S Sportback Audi,e-tron Sportback BMW,1 Series BMW,2 Series BMW,3 Series BMW,4 Series BMW,5 Series BMW,6 Series BMW,7 Series BMW,8 Series BMW,Alpina B7 BMW,M BMW,M2 BMW,M3 BMW,M4 BMW,M5 BMW,M6 BMW,M8 BMW,X1 BMW,X2 BMW,X3 BMW,X3 M BMW,X4 BMW,X4 M BMW,X5 BMW,X5 M BMW,X6 BMW,X6 M BMW,X7 BMW,Z3 BMW,Z4 BMW,Z4 M BMW,Z8 BMW,i3 BMW,i4 BMW,i7 BMW,i8 BMW,iX Bentley,Arnage Bentley,Azure Bentley,Azure T Bentley,Bentayga Bentley,Brooklands Bentley,Continental Bentley,Continental GT Bentley,Flying Spur Bentley,Mulsanne Buick,Cascada Buick,Century Buick,Enclave Buick,Encore Buick,Encore GX Buick,Envision Buick,LaCrosse Buick,LeSabre Buick,Lucerne Buick,Park Avenue Buick,Rainier Buick,Regal Buick,Regal Sportback Buick,Regal TourX Buick,Rendezvous Buick,Riviera Buick,Roadmaster Buick,Skylark Buick,Terraza Buick,Verano 
Cadillac,ATS Cadillac,ATS-V Cadillac,Allante Cadillac,Brougham Cadillac,CT4 Cadillac,CT5 Cadillac,CT6 Cadillac,CT6-V Cadillac,CTS Cadillac,CTS-V Cadillac,Catera Cadillac,DTS Cadillac,DeVille Cadillac,ELR Cadillac,Eldorado Cadillac,Escalade Cadillac,Escalade ESV Cadillac,Escalade EXT Cadillac,Fleetwood Cadillac,LYRIQ Cadillac,SRX Cadillac,STS Cadillac,Seville Cadillac,Sixty Special Cadillac,XLR Cadillac,XT4 Cadillac,XT5 Cadillac,XT6 Cadillac,XTS Chevrolet,1500 Extended Cab Chevrolet,1500 Regular Cab Chevrolet,2500 Crew Cab Chevrolet,2500 Extended Cab Chevrolet,2500 HD Extended Cab Chevrolet,2500 HD Regular Cab Chevrolet,2500 Regular Cab Chevrolet,3500 Crew Cab Chevrolet,3500 Extended Cab Chevrolet,3500 HD Extended Cab Chevrolet,3500 HD Regular Cab Chevrolet,3500 Regular Cab Chevrolet,APV Cargo Chevrolet,Astro Cargo Chevrolet,Astro Passenger Chevrolet,Avalanche Chevrolet,Avalanche 1500 Chevrolet,Avalanche 2500 Chevrolet,Aveo Chevrolet,Beretta Chevrolet,Blazer Chevrolet,Blazer EV Chevrolet,Bolt EUV Chevrolet,Bolt EV Chevrolet,Camaro Chevrolet,Caprice Chevrolet,Caprice Classic Chevrolet,Captiva Sport Chevrolet,Cavalier Chevrolet,City Express Chevrolet,Classic Chevrolet,Cobalt Chevrolet,Colorado Crew Cab Chevrolet,Colorado Extended Cab Chevrolet,Colorado Regular Cab Chevrolet,Corsica Chevrolet,Corvette Chevrolet,Cruze Chevrolet,Cruze Limited Chevrolet,Equinox Chevrolet,Equinox EV Chevrolet,Express 1500 Cargo Chevrolet,Express 1500 Passenger Chevrolet,Express 2500 Cargo Chevrolet,Express 2500 Passenger Chevrolet,Express 3500 Cargo Chevrolet,Express 3500 Passenger Chevrolet,G-Series 1500 Chevrolet,G-Series 2500 Chevrolet,G-Series 3500 Chevrolet,G-Series G10 Chevrolet,G-Series G20 Chevrolet,G-Series G30 Chevrolet,HHR Chevrolet,Impala Chevrolet,Impala Limited Chevrolet,Lumina Chevrolet,Lumina APV Chevrolet,Lumina Cargo Chevrolet,Lumina Passenger Chevrolet,Malibu Chevrolet,Malibu (Classic) Chevrolet,Malibu Limited Chevrolet,Metro Chevrolet,Monte Carlo Chevrolet,Prizm 
Chevrolet,S10 Blazer Chevrolet,S10 Crew Cab Chevrolet,S10 Extended Cab Chevrolet,S10 Regular Cab Chevrolet,SS Chevrolet,SSR Chevrolet,Silverado (Classic) 1500 Crew Cab Chevrolet,Silverado (Classic) 1500 Extended Cab Chevrolet,Silverado (Classic) 1500 HD Crew Cab Chevrolet,Silverado (Classic) 1500 Regular Cab Chevrolet,Silverado (Classic) 2500 HD Crew Cab Chevrolet,Silverado (Classic) 2500 HD Extended Cab Chevrolet,Silverado (Classic) 2500 HD Regular Cab Chevrolet,Silverado (Classic) 3500 Crew Cab Chevrolet,Silverado (Classic) 3500 Extended Cab Chevrolet,Silverado (Classic) 3500 Regular Cab Chevrolet,Silverado 1500 Crew Cab Chevrolet,Silverado 1500 Double Cab Chevrolet,Silverado 1500 Extended Cab Chevrolet,Silverado 1500 HD Crew Cab Chevrolet,Silverado 1500 LD Double Cab Chevrolet,Silverado 1500 Limited Crew Cab Chevrolet,Silverado 1500 Limited Double Cab Chevrolet,Silverado 1500 Limited Regular Cab Chevrolet,Silverado 1500 Regular Cab Chevrolet,Silverado 2500 Crew Cab Chevrolet,Silverado 250
Fast full-text searching in Python (job for Whoosh?)
I need fast text-search on a large (not huge, let's say 30k records totally) list of items. Here's a sample of my raw data (a list of US cars: model and make):

$ head all_cars_unique.csv
Acura,CL
Acura,ILX
Acura,Integra
Acura,Legend
Acura,MDX
Acura,MDX Sport Hybrid
Acura,NSX
Acura,RDX
Acura,RL
Acura,RLX
$ wc -l all_cars_unique.csv
1415 all_cars_unique.csv
$ grep -i v60 all_cars_unique.csv
Genesis,GV60
Volvo,V60
$

Essentially, I want my input field to suggest autofill options with data from this file/list. The user types "v60" and a REST endpoint will offer:

[
  {"model":"GV60", "manufacturer":"Genesis"},
  {"model":"V60", "manufacturer":"Volvo"}
]

i.e. a JSON response that I can use to generate the autofill with JavaScript. My back-end is Python (Flask).

How can I implement this? A library called Whoosh seems very promising (albeit it's so feature-rich that it's almost like shooting a fly with a bazooka in my case), but I see two problems:

1) Whoosh is either abandoned or the project is a mess in terms of community and support (https://groups.google.com/g/whoosh/c/QM_P8cGi4v4), and

2) Whoosh seems to be a Python-only thing, which is great for now, but I wouldn't want this to become an obstacle should I need to port it to a different language at some point.

Are there other options that are fast out there? Can I "grep" through a data structure in python... but faster?

Thanks

Dino
Re: Fast full-text searching in Python (job for Whoosh?)
On 3/4/2023 10:43 PM, Dino wrote:
> I need fast text-search on a large (not huge, let's say 30k records totally) list of items. Here's a sample of my raw data (a list of US cars: model and make)

I suspect I am really close to answering my own question...

>>> import time
>>> lis = [str(a**2+a*3+a) for a in range(0,30_000)]
>>> s = time.process_time_ns(); res = [el for el in lis if "13467" in el]; print(time.process_time_ns() - s);
753800
>>> s = time.process_time_ns(); res = [el for el in lis if "52356" in el]; print(time.process_time_ns() - s);
1068300
>>> s = time.process_time_ns(); res = [el for el in lis if "5256" in el]; print(time.process_time_ns() - s);
862000
>>> s = time.process_time_ns(); res = [el for el in lis if "6" in el]; print(time.process_time_ns() - s);
1447300
>>> s = time.process_time_ns(); res = [el for el in lis if "1" in el]; print(time.process_time_ns() - s);
1511100
>>> s = time.process_time_ns(); res = [el for el in lis if "13467" in el]; print(time.process_time_ns() - s); print(len(res), res[:10])
926900
2 ['134676021', '313467021']
>>>

I can do a substring search in a list of 30k elements in less than 2ms with Python. Is my reasoning sound?

Dino
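The reasoning holds, though a single process_time_ns() delta is noisy; timeit averages over many runs and is less sensitive to scheduler jitter. A sketch, rebuilding the same 30k list and reusing the needle from the session above:

```python
import timeit

lis = [str(a**2 + a*3 + a) for a in range(30_000)]

def search(needle):
    """Same grep-style linear scan as in the session above."""
    return [el for el in lis if needle in el]

# Run the search 100 times, repeat the whole batch 3 times, and keep the
# best batch: min() filters out background noise better than one reading.
per_call = min(timeit.repeat(lambda: search("13467"), number=100, repeat=3)) / 100
print(f"{per_call * 1000:.3f} ms per search, hits: {search('13467')}")
```

The per-call figure should land comfortably in the low-millisecond range on any recent machine.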
Re: LRU cache
Thank you, Gerard. I really appreciate your help

Dino

On 2/16/2023 9:40 PM, Weatherby,Gerard wrote:
> I think this does the trick:
> https://gist.github.com/Gerardwx/c60d200b4db8e7864cb3342dd19d41c9

#!/usr/bin/env python3
import collections
import random
from typing import Hashable, Any, Optional, Dict, Tuple


class LruCache:
    """Dictionary like storage of most recently inserted values"""

    def __init__(self, size: int = 1000):
        """:param size number of cached entries"""
        assert isinstance(size, int)
        self.size = size
        self.insert_counter = 0
        self.oldest = 0
        self._data: Dict[Hashable, Tuple[Any, int]] = {}  # store values and age index
        self._lru: Dict[int, Hashable] = {}  # age counter dictionary

    def insert(self, key: Hashable, value: Any) -> None:
        """Insert into dictionary"""
        existing = self._data.get(key, None)
        self._data[key] = (value, self.insert_counter)
        self._lru[self.insert_counter] = key
        if existing is not None:
            self._lru.pop(existing[1], None)  # remove old counter value, if it exists
        self.insert_counter += 1
        if (sz := len(self._data)) > self.size:  # is cache full?
            assert sz == self.size + 1
            while (key := self._lru.get(self.oldest, None)) is None:
                # index may not be present, if value was reinserted
                self.oldest += 1
            del self._data[key]  # remove oldest key / value from dictionary
            del self._lru[self.oldest]
            self.oldest += 1  # next oldest index
        assert len(self._lru) == len(self._data)

    def get(self, key: Hashable) -> Optional[Any]:
        """Get value or return None if not in cache"""
        if (tpl := self._data.get(key, None)) is not None:
            return tpl[0]
        return None


if __name__ == "__main__":
    CACHE_SIZE = 1000
    TEST_SIZE = 1_000_000
    cache = LruCache(size=CACHE_SIZE)
    all = []
    for i in range(TEST_SIZE):
        all.append(random.randint(-5000, 5000))
    summary = collections.defaultdict(int)
    for value in all:
        cache.insert(value, value * value)
        summary[value] += 1
    smallest = TEST_SIZE
    largest = -TEST_SIZE
    total = 0
    for value, count in summary.items():
        smallest = min(smallest, count)
        largest = max(largest, count)
        total += count
    avg = total / len(summary)
    print(f"{len(summary)} values occurrences range from {smallest} to {largest}, average {avg:.1f}")
    recent = set()  # recent most recent entries
    for i in range(len(all) - 1, -1, -1):  # loop backwards to get the most recent entries
        value = all[i]
        if len(recent) < CACHE_SIZE:
            recent.add(value)
        if value in recent:
            if (r := cache.get(value)) != value * value:
                raise ValueError(f"Cache missing recent {value} {r}")
        else:
            if cache.get(value) != None:
                raise ValueError(f"Cache includes old {value}")

From: Python-list on behalf of Dino
Date: Wednesday, February 15, 2023 at 3:07 PM
To: python-list@python.org
Subject: Re: LRU cache

> Thank you Mats, Avi and Chris
>
> btw, functools.lru_cache seems rather different from what I need, but maybe I am missing something. I'll look closer.
On 2/14/2023 7:36 PM, Mats Wichmann wrote:
> On 2/14/23 15:07, Dino wrote:
Re: LRU cache
Thank you Mats, Avi and Chris

btw, functools.lru_cache seems rather different from what I need, but maybe I am missing something. I'll look closer.

On 2/14/2023 7:36 PM, Mats Wichmann wrote:
> On 2/14/23 15:07, Dino wrote:
Re: Comparing caching strategies
On 2/10/2023 7:39 PM, Dino wrote:
> - How would you structure the caching so that different caching strategies are "pluggable"? change one line of code (or even a config file) and a different caching strategy is used in the next run. Is this the job for a design pattern such as factory or facade?

Turns out that the strategy pattern was the right one for me.
LRU cache
Here's my problem today. I am using a dict() to implement a quick-and-dirty in-memory cache. I stop adding elements when I reach 1000 of them (a totally arbitrary number), but I would like to have something slightly more sophisticated to free up space for newer and potentially more relevant entries. I am thinking of the Least Recently Used principle, but how to implement that is not immediately obvious.

Before I embark on reinventing the wheel, is there a tool, library or smart trick that will allow me to remove elements with LRU logic?

thanks

Dino
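For the archives: the standard library gets close out of the box. functools.lru_cache memoizes a function, and collections.OrderedDict makes an explicit LRU dict only a few lines long. A minimal sketch (the class name and the tiny maxsize are mine):

```python
from collections import OrderedDict

class LRU:
    """Dict-like cache that evicts the least recently used entry."""
    def __init__(self, maxsize=1000):
        self.maxsize = maxsize
        self._d = OrderedDict()

    def get(self, key, default=None):
        if key in self._d:
            self._d.move_to_end(key)       # mark as most recently used
            return self._d[key]
        return default

    def put(self, key, value):
        self._d[key] = value
        self._d.move_to_end(key)
        if len(self._d) > self.maxsize:
            self._d.popitem(last=False)    # drop the least recently used

cache = LRU(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # touch "a", so "b" is now the oldest
cache.put("c", 3)      # evicts "b"
print(list(cache._d))  # ['a', 'c']
```

OrderedDict remembers insertion order, so move_to_end() on access plus popitem(last=False) on overflow is the whole LRU policy.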
Comparing caching strategies
First off, a big shout out to Peter J. Holzer, who mentioned roaring bitmaps a few days ago and led me to quite a discovery.

Now I am stuck in an internal dispute with another software architect (well, with a software architect, I should say, as I probably shouldn't call myself a software architect when confronted with people with more experience than me in building more complex systems). Anyway, now that I know what roaring bitmaps are (and what they can do!), my point is that we should abandon other attempts to build a caching layer for our project and just veer decidedly towards relying on those magic bitmaps and screw anything else. Sure, there is some overhead marshaling our entries into integers and back, but the sheer speed and compactness of RBMs trump any other consideration (according to me, not according to the other guy, obviously).

Long story short: I want to prototype a couple of caching strategies in Python using bitmaps, and measure both speed and memory occupation. So, here are a few questions from an inexperienced programmer for you, friends. Apologies if they are a bit "open ended".

- How would you structure the caching so that different caching strategies are "pluggable"? Change one line of code (or even a config file) and a different caching strategy is used in the next run. Is this the job for a design pattern such as factory or facade?

- What tool should I use to measure/log performance and memory occupation of my script? Google is coming up with quite a few options, but I value the opinion of people here a lot.

Thank you for any feedback you may be able to provide.

Dino
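On the first question, the strategy pattern fits well: every strategy exposes the same interface, and one config string picks the implementation. A hedged sketch with hypothetical class and registry names — a roaring-bitmap-backed class would simply be one more entry in the registry:

```python
class DictCache:
    """Unbounded dict-backed cache."""
    def __init__(self):
        self._d = {}
    def get(self, key):
        return self._d.get(key)
    def put(self, key, value):
        self._d[key] = value

class NullCache:
    """No-op cache: a handy baseline when comparing strategies."""
    def get(self, key):
        return None
    def put(self, key, value):
        pass

STRATEGIES = {"dict": DictCache, "null": NullCache}

def make_cache(name):
    """The one line of config: swap 'dict' for another registered name."""
    return STRATEGIES[name]()

cache = make_cache("dict")
cache.put("x", 42)
print(cache.get("x"))                # 42
print(make_cache("null").get("x"))   # None
```

Because the benchmark code only ever calls get/put through make_cache(), swapping strategies between runs touches exactly one string.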
Re: RE: RE: bool and int
You have your reasons, and I was tempted to stop there, but... I have to pick this...

On 1/26/2023 10:09 PM, avi.e.gr...@gmail.com wrote:
> You can often borrow ideas and code from an online search and hopefully cobble "a" solution together that works well enough. Of course it may suddenly fall apart.

Also carefully designed systems that are the work of experts may suddenly fall apart.

Thank you for all the time you have used to address the points I raised. It was interesting reading.

Dino
Re: bool and int
On 1/25/2023 5:42 PM, Chris Angelico wrote:
> Try this (or its equivalent) in as many languages as possible:
>
> x = (1 > 2)
> x == 0
>
> You'll find that x (which has effectively been set to False, or its equivalent in any language) will be equal to zero in a very large number of languages. Thus, to an experienced programmer, it would actually be quite the opposite: having it NOT be a number would be the surprising thing!

I thought I had already responded to this, but I can't see it. Weird. Anyway, straight out of the Chrome DevTools console:

x = (1>2)
false
x == 0
true
typeof(x)
'boolean'
typeof(0)
'number'
typeof(x) == 'number'
false

So, you are technically correct, but you can see that JavaScript - which comes with many gotchas - does not offer this particular one.
Re: RE: bool and int
Wow. That was quite a message and an interesting read. I am tempted to go deep and say what I agree and what I disagree with, but there are two issues: 1) time, and 2) I will soon be at a disadvantage discussing with people (you or others) who know more than me (which doesn't make them right necessarily, but certainly they'll have the upper hand in a discussion).

Personally, in the first part of my career I got into the habit of learning things fast, sometimes superficially I confess, and then getting stuff done hopefully within time and budget. Not the recommended approach if you need to build software for a nuclear plant. An OK approach (within reason) if you build websites or custom solutions for this or that organization and the budget is what it is. After all, technology moves sooo fast, and what we learn in detail today is bound to be old and possibly useless 5 years down the road.

Also, I argue that there is value in having familiarity with lots of different technologies (front-end and back-end) and knowing (or at least having a sense of) how they can all be made to play together, with an appreciation of the different challenges and benefits that each domain offers. Anyway, everything is equivalent to a Turing machine and AI will screw everyone, including programmers, eventually.

Thanks again and have a great day

Dino

On 1/25/2023 9:14 PM, avi.e.gr...@gmail.com wrote:
> Dino,
>
> There is no such thing as a "principle of least surprise", or if you insist there is, I can nominate many more such "rules", such as "the principle of get out of my way and let me do what I want!" Computer languages with too many rules are sometimes next to unusable in practical situations. I am neither defending nor attacking choices Python or other languages have made. I merely observe and agree to use languages carefully and as documented.
Re: HTTP server benchmarking/load testing in Python
On 1/25/2023 4:30 PM, Thomas Passin wrote:
> On 1/25/2023 3:29 PM, Dino wrote:
> Great! Don't forget what I said about potential overheating if you hit the server with as many requests as it can handle.

Noted. Thank you.
Re: HTTP server benchmarking/load testing in Python
On 1/25/2023 3:27 PM, Dino wrote:
> On 1/25/2023 1:33 PM, orzodk wrote:
>> I have used locust with success in the past. https://locust.io
> First impression, exactly what I need. Thank you Orzo!

The more I learn about Locust and tinker with it, the more I love it. Thanks again.
Re: HTTP server benchmarking/load testing in Python
On 1/25/2023 1:21 PM, Thomas Passin wrote:
> I actually have a Python program that does exactly this.

Thank you, Thomas. I'll check out Locust, mentioned by Orzodk, as it looks like a mature library that appears to do exactly what I was hoping.
Re: HTTP server benchmarking/load testing in Python
On 1/25/2023 1:33 PM, orzodk wrote:
> I have used locust with success in the past. https://locust.io

First impression, exactly what I need. Thank you Orzo!
Re: bool and int
On 1/23/2023 11:22 PM, Dino wrote:
> >>> b = True
> >>> isinstance(b,bool)
> True
> >>> isinstance(b,int)
> True
> >>>

OK, I read everything you guys wrote. Everyone's got their reasons obviously, but allow me to observe that there's also something called the "principle of least surprise". In my case, it took me some time to figure out where a nasty bug was hidden. Letting a bool be an int is quite a gotcha, no matter how hard the benevolent dictator tries to convince me otherwise!
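For the record, the gotcha can be demonstrated in a few lines; the dict-key case in particular is the kind of place where such a bug likes to hide:

```python
# bool is a subclass of int, so True/False behave exactly like 1/0
# in arithmetic, comparisons, and even as dictionary keys.
print(issubclass(bool, int))   # True
print(True + True)             # 2
print(True == 1)               # True

d = {1: "one"}
d[True] = "surprise"           # overwrites the key 1: hash(True) == hash(1)
print(d)                       # {1: 'surprise'}
```

Because True and 1 hash and compare equal, they are the same dictionary key, which is easy to miss in code that mixes flags and counters.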
HTTP server benchmarking/load testing in Python
Hello, I could use something like Apache ab in Python (https://httpd.apache.org/docs/2.4/programs/ab.html).

The reason why ab doesn't quite cut it for me is that I need to define a pool of HTTP requests, and I want the tool to run those (as opposed to running the same request over and over again).

Does such a marvel exist? Thinking about it, it doesn't necessarily need to be Python, but I guess I would have a chance to tweak things if it was.

Thanks

Dino
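Absent a ready-made tool, the request-pool part is easy to sketch with the standard library alone. The throwaway local server below only exists to make the example self-contained; in a real test you would point the pool at your own endpoints:

```python
import http.server
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor

class EchoHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass  # keep the console quiet during the run

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# A pool of *distinct* requests, rather than one URL hammered repeatedly
pool = [f"http://127.0.0.1:{port}/item/{i}" for i in range(20)]

def fetch(url):
    with urllib.request.urlopen(url) as resp:
        return resp.status

with ThreadPoolExecutor(max_workers=5) as ex:
    statuses = list(ex.map(fetch, pool))
server.shutdown()
print(len(statuses), set(statuses))   # 20 {200}
```

ThreadPoolExecutor gives crude concurrency control (max_workers plays the role of ab's -c); timing each fetch() call would add the latency numbers.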
bool and int
$ python
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> b = True
>>> isinstance(b,bool)
True
>>> isinstance(b,int)
True
>>>

WTF!
Re: tree representation of Python data
you rock. Thank you, Stefan.

Dino

On 1/21/2023 2:41 PM, Stefan Ram wrote:
> r...@zedat.fu-berlin.de (Stefan Ram) writes:
>
> def display_( object, last ):
>     directory = object; result = ''; count = len( directory )
>     for entry in directory:
>         count -= 1; name = entry; indent = ''
>         for c in last[ 1: ]:
>             indent += '│   ' if c else '    '
>         indent += '├──' if count else '└──' if last else ''
>         result += '\n' + indent + ( ' ' if indent else '' ) + name
>         if directory[ entry ]:
>             result += display_( directory[ entry ], last + [ count ])
>     return result
>
> This ultimate version has some variable names made more speaking:
>
> def display_( directory, container_counts ):
>     result = ''; count = len( directory )
>     for name in directory:
>         count -= 1; indent = ''
>         for container_count in container_counts[ 1: ]:
>             indent += '│   ' if container_count else '    '
>         indent += '├──' if count else '└──' if container_counts else ''
>         result += '\n' + indent + ( ' ' if indent else '' ) + name
>         if directory[ name ]:
>             result += display_\
>                 ( directory[ name ], container_counts + [ count ])
>     return result
Re: ok, I feel stupid, but there must be a better way than this! (finding name of unique key in dict)
I learned new things today and I thank you all for your responses. Please consider yourselves thanked individually.

Dino

On 1/20/2023 10:29 AM, Dino wrote:
> let's say I have this list of nested dicts:
tree representation of Python data
I have a question that is a bit of a shot in the dark. I have this nice bash utility installed:

$ tree -d unit/
unit/
├── mocks
├── plugins
│   ├── ast
│   ├── editor
│   ├── editor-autosuggest
│   ├── editor-metadata
│   ├── json-schema-validator
│   │   └── test-documents
│   └── validate-semantic
│       ├── 2and3
│       ├── bugs
│       └── oas3
└── standalone
    └── topbar-insert

I just thought that it would be great if there was a Python utility that visualized a similar graph for nested data structures. Of course I am aware of indent (json.dumps()) and pprint, and they are OK options for my need. It's just that the compact, improved visualization would be nice to have. Not so nice that I would go out of my way to build it, but nice enough to use an existing package.

Thanks

Dino
Re: ok, I feel stupid, but there must be a better way than this! (finding name of unique key in dict)
On 1/20/2023 11:06 AM, Tobiah wrote:
> On 1/20/23 07:29, Dino wrote:
> This doesn't look like the program output you're getting.

You are right that I tweaked the names of fields and variables manually (forgot a couple of places, my bad) to illustrate the problem more generally, but hopefully you get the spirit.

"value": cn,
"a": cd[cn]["a"],
"b": cd[cn]["b"]

Anyway, the key point (ooops, a pun) is whether there's a more elegant way to do this (i.e. get a reference to the unique key in a dict() when the key is unknown):

cn = list(cd.keys())[0] # There must be a better way than this!

Thanks
ok, I feel stupid, but there must be a better way than this! (finding name of unique key in dict)
let's say I have this list of nested dicts:

[
  { "some_key": {'a':1, 'b':2}},
  { "some_other_key": {'a':3, 'b':4}}
]

I need to turn this into:

[
  { "value": "some_key", 'a':1, 'b':2},
  { "value": "some_other_key", 'a':3, 'b':4}
]

I actually did it with:

listOfDescriptors = list()
for cd in origListOfDescriptors:
    cn = list(cd.keys())[0]  # There must be a better way than this!
    listOfDescriptors.append({
        "value": cn,
        "type": cd[cn]["a"],
        "description": cd[cn]["b"]
    })

and it works, but I look at this and think that there must be a better way. Am I missing something obvious? PS: Screw OpenAPI! Dino -- https://mail.python.org/mailman/listinfo/python-list
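For what it's worth, a few more direct spellings of "get the only key of a one-entry dict" (a sketch; `cd` mirrors the snippet above):

```python
cd = {"some_key": {'a': 1, 'b': 2}}

cn = next(iter(cd))                   # no intermediate list
(cn2,) = cd                           # unpacking; raises if len(cd) != 1
cn3, inner = next(iter(cd.items()))   # key and value in one step
assert cn == cn2 == cn3 == "some_key"

# The whole transformation collapses to one comprehension:
orig = [{"some_key": {'a': 1, 'b': 2}}, {"some_other_key": {'a': 3, 'b': 4}}]
flat = [{"value": k, **v} for d in orig for k, v in d.items()]
assert flat == [{"value": "some_key", 'a': 1, 'b': 2},
                {"value": "some_other_key", 'a': 3, 'b': 4}]
```

The tuple-unpacking form has the nice property of failing loudly if the dict ever stops having exactly one key.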
Re: Fast lookup of bulky "table"
Thanks a lot, Edmondo. Or better... Grazie mille. On 1/17/2023 5:42 AM, Edmondo Giovannozzi wrote: Sorry, I was just creating an array of 400x10^5 elements that I fill with random numbers:

a = np.random.randn(400, 100_000)

Then I pick one element randomly; it is just a stupid sort on a row and then I take an element in another row, but it doesn't matter, I'm just taking a random element. I may have used other ways to get that, but this was the first that came to my mind.

ia = np.argsort(a[0,:])
a_elem = a[56, ia[0]]

Then I'm finding that element in the whole matrix a (of course I know where it is, but I want to test the speed of a linear search done at the C level):

%timeit isel = a == a_elem

Actually isel is a logic array that is True where a[i,j] == a_elem and False where a[i,j] != a_elem. It may find more than one element but, of course, in our case it will find only the element that we selected at the beginning. So it will give the speed of a linear search plus the time needed to allocate the logic array. The search is on the whole matrix of 40 million elements, not just on one of its rows of 100k elements. On the single row (which, I should say, I chose to be contiguous) it is much faster:

%timeit isel = a[56,:] == a_elem
26 µs ± 588 ns per loop (mean ± std. dev. of 7 runs, 1 loop each)

The matrix is double precision numbers, that is 8 bytes each; I haven't tested it on strings of characters. This wanted to be an estimate of the speed that one can get going to the C level. You lose of course the possibility to have a relational database, you need to have everything in memory, etc... A package that implements tables based on numpy is pandas: https://pandas.pydata.org/ I hope that it can be useful. -- https://mail.python.org/mailman/listinfo/python-list
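Edmondo's experiment can be rerun end-to-end at a smaller scale (assuming numpy is installed; the array is shrunk to 400x10,000 so it runs quickly):

```python
import numpy as np

a = np.random.randn(400, 10_000)   # smaller than the 400x100_000 original
a_elem = a[56, 123]                # pick a known element

# Boolean mask over the whole matrix: a C-level linear scan plus one
# allocation of a same-shaped logic array.
isel = (a == a_elem)
rows_, cols_ = np.nonzero(isel)
assert (56, 123) in zip(rows_.tolist(), cols_.tolist())

# Scanning one contiguous row touches 400x less data:
assert 123 in np.nonzero(a[56, :] == a_elem)[0]
```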
Re: Fast lookup of bulky "table"
On 1/16/2023 1:18 PM, Edmondo Giovannozzi wrote: As a comparison with numpy, given the following lines:

import numpy as np
a = np.random.randn(400, 100_000)
ia = np.argsort(a[0,:])
a_elem = a[56, ia[0]]

I have just taken an element randomly in a numeric table of 400x10^5 elements. To find it with numpy:

%timeit isel = a == a_elem
35.5 ms ± 2.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

And

%timeit a[isel]
9.18 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

As data are not ordered it is searching them one by one, but at C level. Of course it depends on a lot of things... thank you for this. It's probably my lack of experience with Numpy, but... can you explain what is going on here in more detail? Thank you Dino -- https://mail.python.org/mailman/listinfo/python-list
Re: Fast lookup of bulky "table"
On 1/16/2023 2:53 AM, David wrote: See here: https://docs.python.org/3/reference/expressions.html#assignment-expressions https://realpython.com/python-walrus-operator/ Thank you, brother. -- https://mail.python.org/mailman/listinfo/python-list
Re: Fast lookup of bulky "table"
Just wanted to take a moment to express my gratitude to everyone who responded here. You have all been so incredibly helpful. Thank you Dino On 1/14/2023 11:26 PM, Dino wrote: Hello, I have built a PoC service in Python Flask for my work, and - now that the point is made - I need to make it a little more performant (to be honest, chances are that someone else will pick up from where I left off, and implement the same service from scratch in a different language (GoLang? .Net? Java?) but I am digressing). -- https://mail.python.org/mailman/listinfo/python-list
Re: Fast lookup of bulky "table"
On 1/15/2023 2:23 PM, Weatherby,Gerard wrote: That’s about what I got using a Python dictionary on random data on a high memory machine. https://github.com/Gerardwx/database_testing.git It’s not obvious to me how to get it much faster than that. Gerard, you are a rockstar. This is going to be really useful if I do decide to adopt sqlite3 for my PoC, as I understand what's going on conceptually, but never really used sqlite (nor SQL in a long long time), so this may save me a bunch of time. I created a 300 Mb DB using your script. Then:

$ ./readone.py
testing 2654792 of 4655974
Found somedata0002654713 for 1ed9f9cd-0a9e-47e3-b0a7-3e1fcdabe166 in 0.23933520219 seconds

$ ./prefetch.py
Index build 4.42093784897 seconds
testing 3058568 of 4655974
Found somedata202200 for 5dca1455-9cd6-4e4d-8e5a-7e6400de7ca7 in 4.443999403715e-06 seconds

So, if I understand right:

1) once I built a dict out of the DB (in about 4 seconds), I was able to look up an entry/record in 4 microseconds(!)
2) looking up a record/entry using a Sqlite query took 0.24 seconds (i.e. tens of thousands of times slower)

Interesting. Thank you for this. Very informative. I really appreciate that you took the time to write this. The conclusion seems to me that I probably don't want to go the Sqlite route, as I would be placing my data into a database just to extract it back into a dict when I need it fast.

PS: a few minor fixes to the README, as this may be helpful to others:

./venv/... => ./env/..

i.e.

./env/bin/pip install -U pip
./env/bin/pip install -e .

Also add the part in []:

Run create.py [size of DB in bytes] prior to running readone.py and/or prefetch.py

BTW, can you tell me what is going on here? what's := ?

while (increase := add_some(conn,adding)) == 0:

https://github.com/Gerardwx/database_testing/blob/main/src/database_testing/create.py#L40 Dino -- https://mail.python.org/mailman/listinfo/python-list
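Since the question came up: `:=` is the "walrus" (assignment expression) operator added in Python 3.8 - it assigns a name and yields the value inside an expression, so the loop condition can both call the function and capture its result. A self-contained sketch, with `add_some` as a stand-in for the database-insert helper in Gerard's script:

```python
# Fake a function that "adds rows", returning 0 twice and then 7.
values = iter([0, 0, 7])

def add_some():
    return next(values)

# Loop until add_some() returns a non-zero increase. Equivalent to:
#   while True:
#       increase = add_some()
#       if increase != 0:
#           break
while (increase := add_some()) == 0:
    pass

assert increase == 7
```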
Re: Fast lookup of bulky "table"
Thank you, Peter. Yes, setting up my own indexes is more or less the idea of the modular cache that I was considering. Seeing others think in the same direction makes it look more viable. About Scalene, thank you for the pointer. I'll do some research. Do you have any idea about the speed of a SELECT query against a 100k rows / 300 Mb Sqlite db? Dino On 1/15/2023 6:14 AM, Peter J. Holzer wrote: On 2023-01-14 23:26:27 -0500, Dino wrote: Hello, I have built a PoC service in Python Flask for my work, and - now that the point is made - I need to make it a little more performant (to be honest, chances are that someone else will pick up from where I left off, and implement the same service from scratch in a different language (GoLang? .Net? Java?) but I am digressing). Anyway, my Flask service initializes by loading a big "table" of 100k rows and 40 columns or so (memory footprint: order of 300 Mb) 300 MB is large enough that you should at least consider putting that into a database (Sqlite is probably simplest. Personally I would go with PostgreSQL because I'm most familiar with it and Sqlite is a bit of an outlier). The main reason for putting it into a database is the ability to use indexes, so you don't have to scan all 100 k rows for each query. You may be able to do that for your Python data structures, too: Can you set up dicts which map to subsets you need often? There are some specialized in-memory bitmap implementations which can be used for filtering. I've used [Judy bitmaps](https://judy.sourceforge.net/doc/Judy1_3x.htm) in the past (mostly in Perl). These days [Roaring Bitmaps](https://www.roaringbitmap.org/) is probably the most popular. I see several packages on PyPI - but I haven't used any of them yet, so no recommendation from me. Numpy might also help. You will still have linear scans, but it is more compact and many of the searches can probably be done in C and not in Python. 
As you can imagine, this is not very performant in its current form, but performance was not the point of the PoC - at least initially. For performance optimization it is very important to actually measure performance, and a good profiler helps very much in identifying hot spots. Unfortunately until recently Python was a bit deficient in this area, but [Scalene](https://pypi.org/project/scalene/) looks promising. hp -- https://mail.python.org/mailman/listinfo/python-list
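A minimal sketch of Peter's "dicts which map to subsets" idea (column names invented for illustration): precompute, for each frequently-filtered column, a mapping from value to row positions, then answer equality queries and conjunctions from the index instead of scanning all 100k rows.

```python
from collections import defaultdict

rows = [
    {"id": 1, "color": "red",  "size": 10},
    {"id": 2, "color": "blue", "size": 10},
    {"id": 3, "color": "red",  "size": 20},
]

# value -> list of row positions, one index per column we filter on often
index = {"color": defaultdict(list), "size": defaultdict(list)}
for pos, row in enumerate(rows):
    for col in index:
        index[col][row[col]].append(pos)

# Equality query: touch only the matching subset
hits = [rows[i] for i in index["color"]["red"]]
assert [r["id"] for r in hits] == [1, 3]

# Conjunction: intersect candidate sets instead of scanning every row
both = set(index["color"]["red"]) & set(index["size"][20])
assert [rows[i]["id"] for i in sorted(both)] == [3]
```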
Re: Fast lookup of bulky "table"
Thank you for your answer, Lars. Just a clarification: I am already doing a rough measuring of my queries. A fresh query without any caching: < 4s. Cached full query: < 5 micro-s (i.e. 6 orders of magnitude faster) Desired speed for my POC: 10 Also, I didn't want to ask a question with way too many "moving parts", but when I talked about the "table", it's actually a 100k long list of IDs. I can then use each ID to invoke an API that will return those 40 attributes. The API is fast, but still, I am bound to loop through the whole thing to respond to the query, that's unless I pre-load the data into something that allows faster access. Also, as you correctly observed, "looking good with my colleagues" is a nice-to-have feature at this point, not really an absolute requirement :) Dino On 1/15/2023 3:17 AM, Lars Liedtke wrote: Hey, before you start optimizing, I would suggest that you measure response times and query times, data search times and so on. In order to save time, you have to know where you "lose" time. Does your service really have to load the whole table at once? Yes, that might lead to quicker response times on requests, but databases are often very good with caching themselves, so that the first request might be slower than following requests with similar parameters. Do you use a database, or are you reading from a file? Are you maybe looping through your whole dataset on every request, instead of asking for the specific data? Before you start introducing a cache and its added complexity, do you really need that cache? You are talking about saving microseconds, that sounds a bit as if you might be “overdoing” it. How many requests will you have in the future? At least in which magnitude, and how quick do they have to be? You write about 1-4 seconds on your laptop. But that does not really tell you that much, because most probably the service will run on a server.
I am not saying that you should get a server or a cloud-instance to test against, but to talk with your architect about that. I totally understand your impulse to appear as good as can be, but you have to know where you really need to debug and optimize. It will not be advantageous for you if you start to optimize for optimizing's sake. Additionally, if your service is a PoC, optimizing now might not be the first thing to worry about; rather, make everything as simple and readable as possible, and do not spend too much time just showing how it could work. But of course, I do not know the tasks given to you and the expectations you have to fulfil. All I am trying to say is to reconsider where you really could improve and how far you have to improve. -- https://mail.python.org/mailman/listinfo/python-list
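Lars's "measure first" advice can start as simply as wrapping the query path in `time.perf_counter()`, separate from any Flask or network overhead (a sketch with fabricated data):

```python
import time

def run_query(rows, pred):
    # stand-in for the real filter/match logic
    return [r for r in rows if pred(r)]

rows = [{"n": i} for i in range(100_000)]

t0 = time.perf_counter()
hits = run_query(rows, lambda r: r["n"] % 1000 == 0)
elapsed = time.perf_counter() - t0

assert len(hits) == 100
assert elapsed >= 0.0   # inspect this number before optimizing anything
```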
Fast lookup of bulky "table"
Hello, I have built a PoC service in Python Flask for my work, and - now that the point is made - I need to make it a little more performant (to be honest, chances are that someone else will pick up from where I left off, and implement the same service from scratch in a different language (GoLang? .Net? Java?) but I am digressing). Anyway, my Flask service initializes by loading a big "table" of 100k rows and 40 columns or so (memory footprint: order of 300 Mb) and then accepts queries through a REST endpoint. Columns are strings, enums, and numbers. Once initialized, the table is read only. The endpoint will parse the query and match it against column values (equality, inequality, greater than, etc.) Finally, it will return a (JSON) list of all rows that satisfy all conditions in the query. As you can imagine, this is not very performant in its current form, but performance was not the point of the PoC - at least initially. Before I deliver the PoC to a more experienced software architect who will look at my code, though, I wouldn't mind looking a bit less lame and doing something about performance in my own code first, possibly by bringing the average time for queries down from where it is now (order of 1 to 4 seconds per query on my laptop) to 1 or 2 milliseconds on average. To be honest, I was already able to bring the time down to a handful of microseconds thanks to a rudimentary cache that will associate the "signature" of a query to its result, and serve it the next time the same query is received, but this may not be good enough: 1) queries might be many and very different from one another each time, AND 2) I am not sure the server will have a ton of RAM if/when this thing - or whatever is derived from it - is placed into production. How can I make my queries generally more performant, ideally also in case of a new query? Here's what I have been considering: 1. making my cache more "modular", i.e. cache the result of certain (wide) queries.
When a complex query comes in, I may be able to restrict my search to a subset of the rows (as determined by a previously cached partial query). This should keep the memory footprint under control. 2. Load my data into a numpy.array and use numpy.array operations to slice and dice my data. 3. Load my data into sqlite3 and use SELECT statements to query my table. I have never used sqlite, plus there's some extra complexity as comparing certain columns requires custom logic, but I wonder if this architecture would work well also when dealing with a 300Mb database. 4. Other ideas? Hopefully I made sense. Thank you for your attention Dino -- https://mail.python.org/mailman/listinfo/python-list
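One way to try option 3 without committing to it: load the read-only table into an in-memory SQLite database at startup and index the columns used in filters. The schema and values below are made up for illustration; custom comparison logic could be registered with `conn.create_function` if a plain WHERE clause isn't enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, kind TEXT, score REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [("a", "x", 1.0), ("b", "y", 2.5), ("c", "x", 3.0)])
conn.execute("CREATE INDEX idx_kind ON t(kind)")  # avoids a full scan per query

rows = conn.execute(
    "SELECT name, score FROM t WHERE kind = ? AND score > ?", ("x", 1.5)
).fetchall()
assert rows == [("c", 3.0)]
```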
RE: [IronPython] IronPython 2.7 Now Available
The PTVS release is really an extended version of the tools in IronPython 2.7. It adds support for CPython including debugging, profiling, etc... while still supporting IronPython as well. We'll likely either replace the tools distributed w/ IronPython with this version (maybe minus things like HPC support) or we'll pull the IpyTools out of the distribution and encourage people to go for the separate download. No changes will likely happen until IronPython 3.x though, as 2.7 is now out the door and it'd be a pretty significant change. For the time being you'll need to choose one or the other - or avoid choosing by not installing the IpyTools w/ the IronPython install and installing the PTVS instead - or you can just stick w/ the existing IronPython tools. > -Original Message- > From: users-boun...@lists.ironpython.com [mailto:users- > boun...@lists.ironpython.com] On Behalf Of Medcoff, Charles > Sent: Sunday, March 13, 2011 2:15 PM > To: Discussion of IronPython; python-list > Subject: Re: [IronPython] IronPython 2.7 Now Available > > Can someone on the list clarify differences or overlap between the tools > included in this release, and the PTVS release? > ___ > Users mailing list > us...@lists.ironpython.com > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com -- http://mail.python.org/mailman/listinfo/python-list
RE: Python Tools for Visual Studio from Microsoft - Free & Open Source
Patty wrote: > Thanks so much for this reference - and the detailed further explanation! I > have a Windows 7 system and recently installed Visual Studio 2010 for the > SQL Server, Visual C/C++ and Visual Basic. I would love to have this Python > tool installed under Visual Studio but a few questions: 1) I have regular > Python installed not Cpython or Jpython or any other variant (have both 2.6 > and 3.2 versions) so would that be a problem and it won't install or won't > work? 2) I saw that this was a beta, would there be an automatic notification > that there are upgrades (I mean within the software itself) or would it be > advisable for me to wait until it goes final because I am relatively newer to > Python and maybe shouldn't be mucking with a beta of > something 3) there is a message bar at the top right corner of the web > page that a certain number of people are 'following this project' Is that > where you would rely on for upgrades notifications or what exactly would > you be following as far as a 'project' of this type? CPython is actually regular Python - the C just clarifies that it's the implementation written in C (vs. C#, Java, or Python). There won't be any notification of updates via the software itself, but given that you heard about the 1st release within days of it coming out, my guess is you'll hear about the updates as well. I'm not actually certain if following a project on CodePlex will give you e-mail notifications or not. I typically subscribe to CodePlex's RSS feed for projects I'm interested in - for example this feed http://pytools.codeplex.com/project/feeds/rss includes all changes to the project. There's other feeds below the RSS button which track just new releases or other things. -- http://mail.python.org/mailman/listinfo/python-list
RE: Packages at Python.org
Kirby wrote: > ** Unconfirmed rumors about IronPython leave me blog searching this > afternoon. Still part of Codeplex? IronPython is still using CodePlex for bug tracking and posting releases but active development is now on GitHub w/ a Mercurial mirror. Jeff's blog has more info: http://jdhardy.blogspot.com/ -- http://mail.python.org/mailman/listinfo/python-list
RE: Why Python3
Terry wrote: > > IronPython targets Python 2.6. > > They plan to release a 2.7 version sometime this year after CPython2.7 > is released. They plan to release a 3.2 version early next year, soon > after CPython. They should be able to do that because they already have > a 3.1 version mostly done (but will not release it as such) and 3.2 has > no new syntax, so the 3.1 development version will easily morph into a > 3.2 release version. I forget just where I read this, but here is a > public article. > http://www.itworld.com/development/104506/python-3-and-ironpython > Cameron Laird, Python/IronPython developer ''' > As Jimmy Schementi, a Program Manager with Microsoft, e-mailed me last > week, "IronPython's roadmap over the next year includes compatibility > with Python 3. Also, we're planning on a release ... before our first > 3.2-compatible release which will target 2.7 compatibility." Close but not 100% correct - we do plan to release 2.7 sometime this year but 3.2 is going to be sometime next year, not early, I would guess EOY. I guess Jimmy misspoke a little there but the "2.7 this year 3.2 next year" plan is what I said during my PyCon State of IronPython talk and it hasn't changed yet. Also we have only a few 3.x features implemented (enabled w/ a -X:Python30 option since 2.6) instead of having a different build for 3.x. Running with that option isn't likely to run any real 3.x code though but it gives people a chance to test out a few new features. Of course implementing 2.7 also gets us much closer to 3.x than we are today w/ all its backports so we are certainly making progress. -- http://mail.python.org/mailman/listinfo/python-list
ssl, v23 client, v3 server...
In the ssl module docs (and in the tests) it says that if you have a client specifying PROTOCOL_SSLv23 (so it'll use v2 or v3) and a server specifying PROTOCOL_SSLv3 (so it'll only use v3) that you cannot connect between the two. Why doesn't this end up using SSL v3 for the communication? -- http://mail.python.org/mailman/listinfo/python-list
RE: Modifying Class Object
Steve wrote: > id() simply returns a unique value identifying a particular object. In > CPython, where objects do not migrate in memory once created, the > memory > address of the object is used. In IronPython each object is assigned an > id when it is created, and that value is stored as an attribute. Just a point of clarification: In IronPython ids are lazily assigned upon a call to id(). They're actually fairly expensive to create because the ids need to be maintained by a dictionary which uses weak references. > >> If you disagree, please write (in any implementation you like: it need > >> not even be portable, though I can't imagine why it wouldn't be) a > >> Python function which takes an id() value as its argument and > >> returns the value for which the id() value was provided. Just for fun this works in IronPython 2.6:

>>> import clr
>>> clr.AddReference('Microsoft.Dynamic')
>>> from Microsoft.Scripting.Runtime import IdDispenser
>>> x = object()
>>> id(x)
43
>>> IdDispenser.GetObject(43)
>>> IdDispenser.GetObject(43) is x
True

Please, please, no one ever use this code! I do generally agree with the sentiment that id is object identity and in no way related to pointers though. -- http://mail.python.org/mailman/listinfo/python-list
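On the language level the same point can be made without any implementation internals (a sketch, nothing IronPython-specific): id() is only an identity token, and nothing guarantees a way back from an id to its object.

```python
a, b = object(), object()
assert id(a) != id(b)                 # distinct live objects, distinct ids
assert (a is b) == (id(a) == id(b))   # equal ids <=> same object, while alive
assert (a is a) == (id(a) == id(a))
# The IdDispenser.GetObject trick above reverses the mapping only by reaching
# into IronPython's internals; portable Python offers no such inverse.
```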
RE: myths about python 3
Stefan wrote: > From an implementor's point of view, it's actually quite the opposite. Most > syntax features of Python 3 can be easily implemented on top of an existing > Py2 Implementation (we have most of them in Cython already, and I really > found them fun to write), and the shifting-around in the standard library > can hardly be called non-trivial. All the hard work that went into the > design of CPython 3.x (and into its test suite) now makes it easy to just > steal from what's there already. > > The amount of work that the Jython project put into catching up from 2.1 to > 2.5/6 (new style classes! generators!) is really humongous compared to the > adaptations that an implementation needs to do to support Python 3 code. I > have great respect for the Jython project for what they achieved in the > last couple of years. (I also have great respect for the IronPython project > for fighting the One Microsoft Way into opening up, but that's a different > kind of business.) > > If there was enough interest from the respective core developers, I > wouldn't be surprised if we had more than one 'mostly compatible' > alternative Python 3 implementation in a couple of months. But it's the > obvious vicious circle business. As long as there aren't enough important > users of Py3, alternative implementations won't have enough incentives to > refocus their scarce developer time. Going for 2.6/7 first means that most > of the Py3 work gets done anyway, so it'll be even easier then. That makes > 2.6->2.7->3.2/3 the most natural implementation path. (And that, again, > makes it a *really* good decision that 2.7 will be the last 2.x release line.) I just want to echo this as I completely agree. Last time I went through the list it looked like there were around 10 major new features (some of them even not so major) that we needed to implement to bring IronPython up to the 3.0 level.
It shouldn't be too time consuming, and it greatly improves our compatibility by finally having the same string types, but our users don't yet want us to stop supporting 2.x. -- http://mail.python.org/mailman/listinfo/python-list
RE: Ironpython experience
Lev wrote: > I'm an on and off Python developer and use it as one of the tools. > Never for writing "full-blown" applications, but rather small, "one-of- > a-kind" utilities. This time I needed some sort of backup and > reporting utility, which is to be used by the members of our team > once or twice a day. Execution time is supposed be negligible. The > project was an ideal candidate to be implemented in Python. As > expected the whole script was about 200 lines and was ready in a 2 > hours (the power of Python!).Then I downloaded Ironpython and > relatively painlessly (except the absence of zlib) converted the > Python code to Ironpython. Works fine and Ironython really is Python. > But... > > The CPython 2.6 script runs 0.1 seconds, while Ironpython 2.6 runs > about 10 seconds. The difference comes from the start-up, when all > these numerous dlls/assemblies are loaded and JITed. > > Is there any way to speed up the process. Can you give us more information about the environment you're running in? E.g. how did you install IronPython, is this on 32-bit or 64-bit and are you using ipy.exe or ipy64.exe? The sweet spot to be in is a 32-bit process: either a 32-bit machine, or a 64-bit machine running ipy.exe (rather than ipy64.exe). You should also be using ngen'd (pre-compiled) binaries which the MSI does for you. Combining 32-bit plus ngen should greatly reduce startup time and typically on our test machines it only takes a couple of seconds (http://ironpython.codeplex.com/wikipage?title=IP26FinalVsCPy26Perf&referringTitle=IronPython%20Performance). That's still a lot worse than CPython startup time but it's much better than 10 seconds. We also continue to work on startup time - there's already some big improvements in our Main branch which should be showing up in 2.6.1. Matching CPython is still a long ways off if we ever can do it but we do intend to keep on pushing on it. -- http://mail.python.org/mailman/listinfo/python-list
RE: [Python-Dev] PEP 384: Defining a Stable ABI
Dirkjan Ochtman wrote: > > It would seem to me that optimizations are likely to require data > structure changes, for exactly the kind of core data structures that > you're talking about locking down. But that's just a high-level view, > I might be wrong. > In particular I would guess that ref counting is the biggest issue here. I would think not directly exposing the field and having inc/dec ref functions (real methods, not macros) for it would give a lot more ability to change the API in the future. It also might make it easier for alternate implementations to support the same API so some modules could work cross-implementation - but I suspect that's a non-goal of this PEP :). Other fields directly accessed (via macros or otherwise) might have similar problems but they don't seem as core as ref counting. -- http://mail.python.org/mailman/listinfo/python-list
RE: interpreter vs. compiled
It looks like the pickle differences are due to two issues. First IronPython doesn't have ASCII strings so it serializes strings as Unicode. Second there are dictionary ordering differences. If you just do:

{ 'a': True, 'b': set( ) }

CPy prints: {'a': True, 'b': set([])}
IPy prints: {'b': set([]), 'a': True}

The important thing is that we interop - and indeed you can send either pickle string to either implementation and the correct results are deserialized (modulo getting Unicode strings). For your more elaborate example you're right that there could be a problem here. But the DLR actually recognizes this sort of pattern and optimizes for it. All of the additions in your code are what I've been calling serially monomorphic call sites. That is, they see the same types for a while, maybe even just once as in your example, and then they switch to a new type - never to return to the old one. When IronPython gives the DLR the code for the call site, the DLR can detect when the code only differs by constants - in this case type version checks. It will then re-write the code, turning the changing constants into variables. The next time through, when it sees the same code again, it'll re-use the existing compiled code with the new sets of constants. That's still slower than we were in 1.x so we'll need to push on this more in the future - for example producing a general rule instead of a type-specific rule. But for the time being having the DLR automatically handle this has been working well enough for these situations. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of castironpi Sent: Tuesday, July 29, 2008 11:40 PM To: python-list@python.org Subject: Re: interpreter vs. compiled I note that IronPython and Python's pickle.dumps do not return the same value. Perhaps this relates to the absence of interpreter loop.

>>> p.dumps( { 'a': True, 'b': set( ) } )

IPy: '(dp0\nVb\np1\nc__builtin__\nset\np3\n((lp4\ntp5\nRp2\nsVa\np6\nI01\ns.'
CPy: "(dp0\nS'a'\np1\nI01\nsS'b'\np2\nc__builtin__\nset\np3\n((lp4\ntp5\nRp6\ns." You make me think of a more elaborate example.

for k in range( 100 ):
    i= j()
    g= h+ i
    e= f+ g
    c= d+ e
    a= b+ c

Here, j creates a new class dynamically, and returns an instance of it. Addition is defined on it but the return type from it varies. If I read you correctly, IPy can leave hundreds of different addition stubs laying around at the end of the for-loop, each of which only gets executed once or twice, each of which was compiled for the exact combination of types it was called for. I might construe this to be a degenerate case, and the majority of times, you'll reexecute stubs enough to outweigh the length of time the compilation step takes. If you still do the bounds checking, it takes extra instructions (C doesn't), but operation switch-case BINARY_ADD, (PyInt_CheckExact(v) && PyInt_CheckExact(w)), and POP and TOP, are all handled by the selection of stubs from $addSite. I'm reading from last April:

>>> The most interesting cases to me are the 5 tests where CPython is more than
>>> 3x faster than IronPython and the other 5 tests where IronPython is more
>>> than 3x faster than CPython. CPython's strongest performance is in
>>> dictionaries with integer and string keys, list slicing, small tuples and
>>> code that actually throws and catches exceptions. IronPython's strongest
>>> performance is in calling builtin functions, if/then/else blocks, calling
>>> python functions, deep recursion, and try/except blocks that don't actually
>>> catch an exception. <<<

http://lists.ironpython.com/pipermail/users-ironpython.com/2007-April/004773.html

It's interesting that CPython can make those gains still by using a stack implementation. I'll observe that IronPython has the additional dependency of the full .NET runtime. (It was my point 7/18 about incorporating the GNU libs, that to compile to machine-native, as a JIT does, you need the instruction set of the machine.)
Whereas, CPython can disregard them, having already been compiled for it. I think what I was looking for is that IronPython employs .NET to compile to machine instructions, once it's known what the values of the variables are that are the operands. The trade-off is compilation time + type checks + stub look-up. What I want to know is, if __add__ performs an attribute look-up, is that optimized in any way, after the IP is already in compiled code? After all that, I don't feel so guilty about stepping on Tim's toes. On Jul 30, 12:12 am, Dino Viehland <[EMAIL PROTECTED]> wrote: > IronPython doesn't have an interpreter loop
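Dino's interop claim about the pickle strings is easy to check from the CPython side (a sketch using protocol 0, the text protocol shown in the quoted output):

```python
import pickle

d1 = {'a': True, 'b': set()}
d2 = {'b': set(), 'a': True}   # same mapping, different insertion order

p1, p2 = pickle.dumps(d1, 0), pickle.dumps(d2, 0)
assert p1 != p2                                      # the bytes differ...
assert pickle.loads(p1) == pickle.loads(p2) == d1    # ...the values do not
```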
RE: interpreter vs. compiled
IronPython doesn't have an interpreter loop and therefore has no POP / TOP / etc... Instead what IronPython has is a method call Int32Ops.Add which looks like:

public static object Add(Int32 x, Int32 y) {
    long result = (long) x + y;
    if (Int32.MinValue <= result && result <= Int32.MaxValue) {
        return Microsoft.Scripting.Runtime.RuntimeHelpers.Int32ToObject((Int32)(result));
    }
    return BigIntegerOps.Add((BigInteger)x, (BigInteger)y);
}

This is the implementation of int.__add__. Note that calling int.__add__ can actually return NotImplemented and that's handled by the method binder looking at the strong typing defined on Add's signature here - and then automatically generating the NotImplemented result when the arguments aren't ints. So that's why you don't see that here even though it's the full implementation of int.__add__. Ok, next if you define a function like:

def adder(a, b):
    return a + b

this turns into a .NET method, which will get JITed, which in C# would look something like:

static object adder(object a, object b) {
    return $addSite.Invoke(a, b);
}

where $addSite is a dynamically updated call site. $addSite knows that it's performing addition and knows how to do nothing other than update the call site the 1st time it's invoked. $addSite is local to the function so if you define another function doing addition it'll have its own site instance. So the 1st thing the call site does is a call back into the IronPython runtime which starts looking at a & b to figure out what to do. Python defines that as try __add__, maybe try __radd__, handle coercion, etc... So we go looking through finding the __add__ method - if that can return NotImplemented then we find the __radd__ method, etc... In this case we're just adding two integers and we know that the implementation of Add() won't return NotImplemented - so there's no need to call __radd__.
We know we don't have to worry about NotImplemented because the Add method doesn't have the .NET attribute indicating it can return NotImplemented. At this point we need to do two things: generate the test which is going to see if future arguments are applicable to what we just figured out, and generate the code which is actually going to handle this. That gets combined into the new call site delegate, and it'll look something like:

    static object CallSiteStub(CallSite site, object a, object b) {
        if (a != null && a.GetType() == typeof(int) && b != null && b.GetType() == typeof(int)) {
            return IntOps.Add((int)a, (int)b);
        }
        return site.UpdateBindingAndInvoke(a, b);
    }

That gets compiled down as a lightweight dynamic method which also gets JITed. The next time through, the call site's Invoke body will be this method and things will go really fast if we have ints again. Also notice this is looking an awful lot like the inlined/fast-path(?) code dealing with ints that you quoted. If everything was awesome (currently it's not, for a couple of reasons) the JIT would even inline the IntOps.Add call and it'd probably be near identical. And everything would be running native on the CPU. So that's how 2 + 2 works... Finally, if it's a user type then we'd generate a more complicated test like (getting more and more pseudo-code to keep things simple):

    if (PythonOps.CheckTypeVersion(a, 42) && PythonOps.CheckTypeVersion(b, 42)) {
        return $callSite.Invoke(__cachedAddSlot__.__get__(a), b);
    }

Here $callSite is another stub which will handle doing optimal dispatch to whatever __add__.__get__ will return. It could be a Python type, it could be a user-defined function, it could be the Python built-in sum function, etc..., so that's the reason for the extra dynamic dispatch. So in summary: everything is compiled to IL.
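The call-site mechanism described above (resolve the operation once, then cache a type-guarded fast path) can be modeled in ordinary Python as a tiny inline cache. The names here (make_add_site, invoke) are invented for the sketch:

```python
def make_add_site():
    # One cache per site, mirroring $addSite being local to each function.
    cache = {}

    def invoke(a, b):
        key = (type(a), type(b))
        handler = cache.get(key)
        if handler is None:
            # The "UpdateBindingAndInvoke" step: resolve the operation
            # for this pair of argument types once, then remember it.
            handler = type(a).__add__
            cache[key] = handler
        return handler(a, b)  # fast path on every later call with these types

    return invoke

add_site = make_add_site()
print(add_site(2, 2))      # 4   (miss: resolves and caches int.__add__)
print(add_site(3, 4))      # 7   (hit: same (int, int) key)
print(add_site("a", "b"))  # ab  (miss: caches str.__add__ under a new key)
```

IronPython's stubs additionally compile the guard and fast path into JITed machine code; the dictionary lookup here stands in for that generated type test.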
At runtime we have lots of stubs all over the place which do the work to figure out the dynamic operation and then cache the result of that calculation. Also what I've just described is how IronPython 2.0 works. IronPython 1.0 is basically the same but mostly w/o the stubs and where we use stub methods they're much less sophisticated. Also, IronPython is open source - www.codeplex.com/IronPython -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of castironpi Sent: Tuesday, July 29, 2008 9:20 PM To: python-list@python.org Subject: Re: interpreter vs. compiled On Jul 29, 7:39 am, alex23 <[EMAIL PROTECTED]> wrote: > On Jul 29, 2:21 pm, castironpi <[EMAIL PROTECTED]> wrote: > > > On Jul 28, 5:58 pm, Fuzzyman <[EMAIL PROTECTED]> wrote: > > > Well - in IronPython user code gets compiled to in memory assemblies > > > which can be JIT'ed. > > > I don't believe so. > > Uh, you're questioning someone who is not only co-author of a book on > IronPython, but also a developer on one of the first IronPython-based > c
RE: Questions on 64 bit versions of Python
The end result is that on a 32-bit machine IronPython runs in a 32-bit process and on a 64-bit machine it runs in a 64-bit process. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Driscoll Sent: Friday, July 25, 2008 5:58 AM To: python-list@python.org Subject: Re: Questions on 64 bit versions of Python On Jul 25, 5:52 am, Fredrik Lundh <[EMAIL PROTECTED]> wrote: > M.-A. Lemburg wrote: > >> 4. Is there a stable version of IronPython compiled under a 64 bit > >> version of .NET? Anyone have experience with such a beast? > > > Can't comment on that one. > > Should that matter? Isn't IronPython pure CLR? > > IronPython is written in C# and runs in/with the CLR, if that's what you mean. Well, IronPython 1 works with the CLR and is equivalent to Python 2.4, whereas IronPython 2 works with the DLR and is equivalent to Python 2.5 Mike -- http://mail.python.org/mailman/listinfo/python-list
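From Python code you can confirm which kind of process you ended up in; the size of a C pointer reported by struct tells you the bitness of the running interpreter:

```python
import struct
import sys

# The size of a C pointer ("P") in the running process: 4 bytes in a
# 32-bit process, 8 bytes in a 64-bit one. sys.maxsize agrees with it.
bits = struct.calcsize("P") * 8
print(f"running in a {bits}-bit process")
print(sys.maxsize > 2**32)  # True only in a 64-bit process
```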
trinity school defender
[Translated from Croatian:] Regarding the above-mentioned flyer, line 8: "there are more than 60,000 viruses and other harmful programs". Viruses alone number several hundred thousand; together with potentially unwanted applications and other malicious code, the figure goes over a million. -- http://mail.python.org/mailman/listinfo/python-list
RE: Is there a way to use .NET DLL from Python
>> >> Oh, I know what you mean. >> But that was exactly the reason for having a .DLLs folder, isn't it? >> When you place an assembly into this folder, you avoid having to write >> this boilerplate code, and simply import the assembly as you would >> with a normal python module. At least, that's how it worked in >> previous versions... >No. You have always had to add references to assemblies before being >able to use the namespaces they contain. You even have to do this with >C# in Visual Studio. This *should* work in both IronPython 1.x and IronPython 2.0 - the catch though is that it's implemented in the default site.py we ship with. So if you do the usual thing and use CPython's Lib directory you'll lose this feature w/o copying it over. -- http://mail.python.org/mailman/listinfo/python-list
RE: Can IronPython work as Windows Scripting Host (WSH) language?
Currently IronPython doesn't support being hosted in WSH. It's something we've discussed internally in the past but we've never had the cycles to make it work. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of misiek3d Sent: Thursday, June 28, 2007 3:07 AM To: python-list@python.org Subject: Can IronPython work as Windows Scripting Host (WSH) language? Hello I want to use IronPython as Windows Scripting Host language. Is it possible? How can I do it? I know that ActivePython works as WSH language but for specific reasons I need to use IronPython. regards Michal -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: ironpython exception line number
Given a file foo.py containing just:

    def f():

You should get these results:

    IronPython 1.0.60816 on .NET 2.0.50727.312
    Copyright (c) Microsoft Corporation. All rights reserved.
    >>> try:
    ...     execfile('foo.py')
    ... except IndentationError, e:
    ...     import sys
    ...     x = sys.exc_info()
    ...
    >>> print x[1].filename, x[1].lineno, x[1].msg, x[1].offset, x[1].text, x[1].args
    foo.py 2 unexpected token 1 ('unexpected token ', ('foo.py', 2, 1, ''))
    >>>

Which is very similar to the result you get from CPython, although we seem to disagree about what to expect next:

    Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> try:
    ...     execfile('foo.py')
    ... except IndentationError, e:
    ...     import sys
    ...     x = sys.exc_info()
    ...
    >>> print x[1].filename, x[1].lineno, x[1].msg, x[1].offset, x[1].text, x[1].args
    foo.py 2 expected an indented block 9 ('expected an indented block', ('foo.py', 2, 9, ''))
    >>> ^Z

If you're hosting IronPython and catching this from a .NET language then you'll be catching the .NET exception. In that case you can access the original Python exception from ex.Data["PythonExceptionInfo"]. Alternately you could catch PythonSyntaxErrorException and access its properties (Line, Column, FileName, LineText, Severity, and ErrorCode). -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Troels Thomsen Sent: Tuesday, June 26, 2007 1:33 PM To: python-list@python.org Subject: ironpython exception line number Hello, When an exception occurs in an IronPython-executed script and I print sys.exc, I get something ugly like the example below. How can I get the file name and line number?
Thx in advance Troels

    26-06-2007 13:19:04 : IronPython.Runtime.Exceptions.PythonIndentationError: unexpected token def
       at IronPython.Compiler.SimpleParserSink.AddError(String path, String message, String lineText, CodeSpan span, Int32 errorCode, Severity severity)
       at IronPython.Compiler.CompilerContext.AddError(String message, String lineText, Int32 startLine, Int32 startColumn, Int32 endLine, Int32 endColumn, Int32 errorCode, Severity severity)
       at IronPython.Compiler.Parser.ReportSyntaxError(Location start, Location end, String message, Int32 errorCode)
       at IronPython.Compiler.Parser.ReportSyntaxError(Token t, Int32 errorCode, Boolean allowIncomplete)
       at IronPython.Compiler.Parser.ParseSuite()
       at IronPython.Compiler.Parser.ParseFuncDef()
       at IronPython.Compiler.Parser.ParseStmt()
       at IronPython.Compiler.Parser.ParseSuite()
       at IronPython.Compiler.Parser.ParseClassDef()
       at IronPython.Compiler.Parser.ParseStmt()
       at IronPython.Compiler.Parser.ParseFileInput()
       at IronPython.Hosting.PythonEngine.Compile(Parser p, Boolean debuggingPossible)
       at IronPython.Hosting.PythonEngine.CompileFile(String fileName)
       at IronPython.Hosting.PythonEngine.ExecuteFile(String fileName)

-- http://mail.python.org/mailman/listinfo/python-list
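The same attributes are available in current CPython; a sketch using compile(), since Python 3 no longer has execfile:

```python
# Reproduce the thread's scenario: a def whose body on line 2 is not indented.
source = "def f():\npass\n"
err = None
try:
    compile(source, "foo.py", "exec")
except SyntaxError as exc:   # IndentationError is a subclass of SyntaxError
    err = exc

print(err.filename, err.lineno)  # foo.py 2
print(err.msg)                   # e.g. "expected an indented block ..."
```

The exact msg text varies between CPython versions, but filename and lineno are reliable, which is what the question was after.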
RE: IronPython 1.0 - Bugs or Features?
Yes, IronPython generates IL which the JIT will then compile when the method is invoked - so our parse/compile time is slower due to this. We've experimented w/ a fully interpreted mode (which can be enabled with -X:FastEval) where we walk the generated AST instead of compiling it, but that mode doesn't necessarily pass all the tests (and would get worse performance for long running code). There are other issues w/ startup time as well besides this though that we need to fix (for example we load all the types in mscorlib & System before we drop you into the interpreter, which is a lot of types to be loading...). I suspect that for a small code snippet it's issues like these that are the most noticeable. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Super Spinner Sent: Wednesday, September 06, 2006 4:03 PM To: python-list@python.org Subject: Re: IronPython 1.0 - Bugs or Features? Claudio Grondi wrote: > tjreedy wrote: > > "Claudio Grondi" <[EMAIL PROTECTED]> wrote in message > > news:[EMAIL PROTECTED] > > > >>I also erroneously assumed, that the first problem was detected > >>during parsing ... so, by the way: how can I distinguish an error > >>raised while parsing the code and an error raised when actually running the > >>code? > > > > > > Parsing detects and reports syntax errors and maybe something else > > if you use non-ascii chars without matching coding cookie. Other > > errors are runtime. > Let's consider >print '"Data ê"' > > In CPython 2.4.2 there is in case of non-ascii character: >sys:1: DeprecationWarning: Non-ASCII character '\xea' in file > C:\IronPython-1.0-BugsOrFeatures.py on line 3, but no encoding > declared; see http://www.python.org/peps/pep-0263.html for details > "Data♀♂ Û" > > IronPython does not raise any warning and outputs: > "Data♀♂ ?" > > So it seems, that IronPython is not that close to CPython as I have it > expected. 
> It takes much more time to run this above simple script in IronPython > as in CPython - it feels as IronPython were extremely busy with > starting itself. > > Claudio Grondi IronPython is a .NET language, so does that mean that it invokes the JIT before running actual code? If so, then "simple short scripts" would take longer with IronPython "busy starting itself" loading .NET and invoking the JIT. This effect would be less noticeable, the longer the program is. But I'm just guessing; I've not used IronPython. -- http://mail.python.org/mailman/listinfo/python-list
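The two strategies discussed in this thread (compile the program, or walk its tree directly as -X:FastEval did) are easy to contrast with CPython's ast module. The tiny evaluator below is an illustrative sketch, not IronPython's actual implementation:

```python
import ast

tree = ast.parse("2 + 3 * 4", mode="eval")

# Strategy 1: compile the AST and run the result (IronPython emits IL
# for the JIT; CPython emits bytecode, but the shape is the same).
compiled = compile(tree, "<expr>", "eval")
print(eval(compiled))  # 14

# Strategy 2: a tree-walking evaluator. No compilation cost up front,
# but every execution re-traverses the tree, so long-running code loses.
def walk(node):
    if isinstance(node, ast.Expression):
        return walk(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        left, right = walk(node.left), walk(node.right)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    raise NotImplementedError(type(node).__name__)

print(walk(tree))  # 14
```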
RE: IronPython 1.0 - Bugs or Features?
Warnings is one of the features that didn't quite make it for v1.0. In general w.r.t. non-ASCII characters you'll find IronPython to be more like Jython in that all strings are Unicode strings. But other than that we do support PEP-263 for the purpose of defining alternate file encodings. We're also aware of the startup time and will be working on reducing that in the future. Thanks for the feedback! -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Claudio Grondi Sent: Wednesday, September 06, 2006 1:47 PM To: python-list@python.org Subject: Re: IronPython 1.0 - Bugs or Features? tjreedy wrote: > "Claudio Grondi" <[EMAIL PROTECTED]> wrote in message > news:[EMAIL PROTECTED] > >>I also erroneously assumed, that the first problem was detected during >>parsing ... so, by the way: how can I distinguish an error raised >>while parsing the code and an error raised when actually running the code? > > > Parsing detects and reports syntax errors and maybe something else if > you use non-ascii chars without matching coding cookie. Other errors > are runtime. Let's consider print '"Data ê"' In CPython 2.4.2 there is in case of non-ascii character: sys:1: DeprecationWarning: Non-ASCII character '\xea' in file C:\IronPython-1.0-BugsOrFeatures.py on line 3, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details "Data♀♂ Û" IronPython does not raise any warning and outputs: "Data♀♂ ?" So it seems, that IronPython is not that close to CPython as I have it expected. It takes much more time to run this above simple script in IronPython as in CPython - it feels as IronPython were extremely busy with starting itself. Claudio Grondi -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
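PEP 263 in action, for reference: the coding cookie tells the compiler how to decode the raw source bytes, which is exactly what the warning discussed above is about. The snippet targets modern CPython, where compile() accepts bytes and honors the cookie:

```python
# The byte 0xEA is 'ê' in latin-1; without the cookie on line 1 the
# compiler would reject it, since 0xEA is not valid UTF-8 on its own.
src = b"# -*- coding: latin-1 -*-\ns = '\xea'\n"
ns = {}
exec(compile(src, "<demo>", "exec"), ns)
print(ns["s"])  # the single character U+00EA
```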
RE: Determining if an object is a class?
The first check is also off; it should be:

    if issubclass(type(Test), type):

otherwise you miss the metaclass case:

    class foo(type): pass

    class Test(object):
        __metaclass__ = foo

    obj = Test
    if type(obj) == type:
        'class obj'
    else:
        'not a class'

just on the off-chance you run into a metaclass :) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Clay Culver Sent: Wednesday, July 12, 2006 2:07 PM To: python-list@python.org Subject: Re: Determining if an object is a class? Ahh much better. Thanks. -- http://mail.python.org/mailman/listinfo/python-list
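The same point in modern Python 3 spelling (the metaclass= keyword replaced the __metaclass__ attribute):

```python
class Foo(type):
    pass

class Plain:
    pass

class Test(metaclass=Foo):   # type(Test) is Foo, not type
    pass

def is_class(obj):
    # issubclass on the object's type catches metaclass instances too.
    return issubclass(type(obj), type)

print(type(Test) == type)               # False: the naive equality check misses it
print(is_class(Plain), is_class(Test))  # True True
```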
RE: Undocumented alternate form for %#f ?
Ahh, cool... Thanks for the explanation! Do you want to help develop Dynamic languages on CLR? (http://members.microsoft.com/careers/search/details.aspx?JobID=6D4754DE-11F0-45DF-8B78-DC1B43134038) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Hughes Sent: Friday, April 28, 2006 1:00 PM To: python-list@python.org Subject: Re: Undocumented alternate form for %#f ? Dino Viehland wrote: > I'm assuming this is by-design, but it doesn't appear to be > documented: > > >>> '%8.f' % (-1) > ' -1' > >>> '%#8.f' % (-1) > ' -1.' > > > The docs list the alternate forms, but there isn't one listed for > f/F. It would seem the alternate form for floating points is > truncate & round the floating point value, but always display the . > at the end. Is that correct? The Python % operator follows the C sprintf function pretty darn closely in behaviour (hardly surprising really, though I've never peeked at the implementation). Hence "man sprintf" can provide some clues here. From man sprintf on my Linux box: # The value should be converted to an ``alternate form''. For o conversions, the first character of the output string is made zero (by prefixing a 0 if it was not zero already). For x and X conversions, a non-zero result has the string `0x' (or `0X' for X conversions) prepended to it. For a, A, e, E, f, F, g, and G conversions, the result will always contain a decimal point, even if no digits follow it (normally, a decimal point appears in the results of those conversions only if a digit follows). For g and G conversions, trailing zeros are not removed from the result as they would otherwise be. For other conversions, the result is undefined. Hence, I don't think it's the # doing the truncating here, but it certainly is producing the mandatory decimal point. If you get rid of the "." 
in the specification, it uses the default decimal precision (6):

    >>> "%8f" % (-1)
    '-1.000000'
    >>> "%#8f" % (-1)
    '-1.000000'

No difference with the alternate specification here as the precision is non-zero. Again, from man sprintf: The precision [snip] If the precision is given as just `.', or the precision is negative, the precision is taken to be zero. This gives the minimum number of digits to appear for d, i, o, u, x, and X conversions, the number of digits to appear after the radix character for a, A, e, E, f, and F conversions, the maximum number of significant digits for g and G conversions, or the maximum number of characters to be printed from a string for s and S conversions. HTH, Dave. -- http://mail.python.org/mailman/listinfo/python-list
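These behaviors still hold in current CPython and are quick to verify:

```python
# Precision "." with no digits parses as precision 0; the "#" flag forces
# the decimal point to appear even when no fractional digits follow it.
print(repr('%8.f' % -1))   # '      -1'
print(repr('%#8.f' % -1))  # '     -1.'
print(repr('%#8f' % -1))   # '-1.000000' (nonzero precision: '#' changes nothing)
```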
Undocumented alternate form for %#f ?
I'm assuming this is by-design, but it doesn't appear to be documented:

    >>> '%8.f' % (-1)
    '      -1'
    >>> '%#8.f' % (-1)
    '     -1.'

The docs list the alternate forms, but there isn't one listed for f/F. It would seem the alternate form for floating points is truncate & round the floating point value, but always display the . at the end. Is that correct? -- http://mail.python.org/mailman/listinfo/python-list