Is there a difference between python
Is there a difference between the following 2 ways to launch a console-less script under Windows? python
Simple Python github library to push/pull files?
Any recommendations on a library providing a simple interface to Github for basic push/pull type of actions? I know there's a native GitHub RESTful API but wondering if anyone has placed a friendly Python wrapper on top of that interface? PyGithub supports a rich set of actions, but doesn't appear to provide support to push/pull files to/from a repo. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Recommendation on best cross-platform encryption package
Is the cryptography package still considered the "best" cross-platform package for encrypting sensitive data being stored in files on disk? Use case: JSON data files containing potentially confidential/PII data using something along the lines of AES256 encryption. Goal is to encrypt data in memory before saving to disk and to read files into memory and decrypt from there using io.BytesIO streams. Thank you, Malcolm
Re: Python 3.6 on Windows - does a python3 alias get created by installation?
Thanks Paul and Dan. @Paul: Yes, it *IS* a bit confusing. Your pip explanation hit the spot. @Dan: Yes, symlinks would be a good workaround. Malcolm
Python 3.6 on Windows - does a python3 alias get created by installation?
I'm jumping between Linux, Mac and Windows environments. On Linux and Mac we can invoke Python via python3 but on Windows it appears that only python works. Interestingly, Windows supports both pip and pip3 flavors. Am I missing something? And yes, I know I can manually create a python3 alias by copying python.exe to python3.exe but that approach has its own set of nuances on locked down servers plus the hassle of keeping these python3 copies up-to-date across Python updates. Also curious: Do the Windows versions of Python 3.7 and 3.8 provide a python3 alias to start Python? Thanks! Malcolm
Re: Application Preferences
Hi Dave,

> I agree that a traditional INI file is the easiest way, especially since
> Python supports them. So if I understand you, your apps look like this:

Yes, with the following nuance for our application-sized scripts that require multiple configuration files. In this latter case, we place all config files in a ../conf folder (see the OR option below).

-App Dir
  |
  +-App Code Folder (src)
      |
      +-App INI file

OR

-App Dir
  |
  +-conf
      |
      +-App INI file

AND

-Home Folder (the default location, but user can select other locations)
  |
  +-App INI Folder
      |
      +-App INI file

Malcolm
Re: Application Preferences
Hi Dave,

> The plan for an app that I'm doing was to use SQLite for data and to hold the
> preference settings as some apps do. The plan was changed last week to allow
> the user to select the location of the data files (SQLite) rather than
> putting it where the OS would dictate. Ok with that, but it brings up some
> questions. First, I will need to have a file that points to the location of
> the data file since that can't be hard coded. Second, if I have to have a
> file that is likely part of the application group of files, would it make
> more sense to use a more traditional preferences file? How have other Python
> app developers done in this case?

We handle the "where is my config file" question by defaulting to the script's current directory, then a script-specific folder within the user's home directory. Users can override that behavior by setting a script-specific environment variable or by using a command line switch to point to a different location or config file. We store our preferences in an INI-style config file, which we've found easier to support when there are problems.

Malcolm
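A minimal sketch of the lookup order described above, using only the standard library. The function name `find_config`, the `<APP>_CONFIG` environment variable naming, and the `~/.<app>` folder are assumptions made for illustration; the post does not say which names are actually used. Overrides are checked first so that they win over the defaults.

```python
import os
from pathlib import Path


def find_config(script_dir, app_name, cli_path=None, env_var=None):
    """Return the first existing config file, or None.

    Search order: command line switch, environment variable,
    script directory, then an app-specific folder in the home directory.
    """
    ini_name = f"{app_name}.ini"
    candidates = []
    if cli_path:
        candidates.append(Path(cli_path))
    # Hypothetical naming convention: MYAPP_CONFIG for an app called "myapp".
    env_value = os.environ.get(env_var or f"{app_name.upper()}_CONFIG")
    if env_value:
        candidates.append(Path(env_value))
    candidates.append(Path(script_dir) / ini_name)
    candidates.append(Path.home() / f".{app_name}" / ini_name)
    for candidate in candidates:
        # An override may name either a folder or the config file itself.
        path = candidate / ini_name if candidate.is_dir() else candidate
        if path.is_file():
            return path
    return None
```

The returned path can then be handed to `configparser.ConfigParser().read(...)`.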
Re: Books for Python 3.7
Python Cookbook; highly recommended. Malcolm
Re: pyodbc -> MS-SQL Server Named Instance ?
You may need to update your ODBC driver from version 13 to 17.x. Malcolm
Re: Looking for tips and gotchas for working with Python 3.5 zipapp feature
> I am exactly in the "pretty advanced usage": I want to create a zip that
> embed numpy. In this case, I have to bundle the C extension. How can I do
> that?

1. PyInstaller
2. PyOxidizer (new technology, may or may not support NumPy)

Let us know how you make out.

Malcolm
Re: change spacing to two instead of four with pep8 or flake8?
> I've also taken to having my files auto-formatted with yapf on save ...

@Cameron: Have you considered black at all and if so, what are your thoughts?

Thanks, Malcolm
Re: pip vs python -m pip?
> From: Chris Angelico
> Are you doing this in cmd.exe, powershell, bash, or some other shell?

Same result via cmd.exe and PowerShell (ps1).

> There are a LOT of ways that the Windows path can fail to pick up the correct
> 'pip'. Normally activating a venv should let you use "pip" to mean the right
> thing just as "python" does, but maybe something's cached?

I think you're on to something. Running pip as a package (python -m pip) will force the use of the virtual env copy of pip. Running pip as an application vs package may use the system version of pip.

Malcolm
Re: How to force "python -m site" ENABLE_USER_SITE to false?
> From: Ed Leafe
> StackOverflow:
> https://stackoverflow.com/questions/25584276/how-to-disable-site-enable-user-site-for-an-environment

Thanks Ed! My takeaway from the above article and our path going forward is to explicitly force ENABLE_USER_SITE to false via Python's "-s" switch.

Malcolm
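As a quick way to confirm the effect of the "-s" switch, the sketch below starts a child interpreter and prints its site.ENABLE_USER_SITE value. The helper name `user_site_enabled` is made up for this example; only `sys.executable`, `subprocess.run`, and the `site` module are real.

```python
import subprocess
import sys


def user_site_enabled(*python_options):
    """Start a child interpreter and report its site.ENABLE_USER_SITE value."""
    command = [sys.executable, *python_options,
               "-c", "import site; print(site.ENABLE_USER_SITE)"]
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout.strip()


# With -s the child interpreter reports False.
print(user_site_enabled("-s"))
```

Setting the PYTHONNOUSERSITE environment variable is the documented equivalent when the command line cannot be changed.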
Re: pip vs python -m pip?
> you must be picking up pip from a different python install (or virtualenv)
> than you are picking up python.
> Check your %PATH%

That was our first guess. Only one version of Python is installed on the system (we install on an empty, freshly service-packed Windows VM). Only one version of python*.exe was found via Explorer. This behavior was observed across multiple Windows 2016 Enterprise servers and Windows 10 Professional desktops.

Malcolm
How to force "python -m site" ENABLE_USER_SITE to false?
Any suggestions on how one can force the "python -m site" ENABLE_USER_SITE value to false? Is it possible to globally force this setting - across all users - when installing a system wide version of Python ... or via a command line option when starting a Python session? Motivation: When ENABLE_USER_SITE is true, packages can get accidentally installed in user specific Python\Python3XX\site-packages folder, overriding system packages and ... apparently (at least under Windows) ... virtual environment packages as well. Thank you, Malcolm
pip vs python -m pip?
64-bit Python 3.6.8 running on Windows with a virtual environment activated. "pip -v" reports 19.0.3, while "python -m pip" reports 19.1.1. Is this behavior by design or a bug? My takeaway is that it's better to run "python -m pip ..." vs "pip ..." when running pip related tasks. Thoughts? Malcolm
What's the latest best practice on Python project directory layouts?
I have a collection of command line scripts that share a collection of common modules. This code collection is for internal use and will run under a single version of Python 3.6+ and a single OS. My understanding of best practice is to organize this collection of Python files into a folder structure like this:

# common files
.gitignore
readme.md
requirements.txt
setup.py <--- what is the advantage of this file for internally distributed code bases?

# app specific package folders
app-1
    __init__.py (optional; if needed)
    __main__.py
    app-1-module-1.py
    app-1-module-2.py
    app-1-module-N.py
app-2
    __init__.py (optional; if needed)
    __main__.py
    app-2-module-1.py
    app-2-module-2.py
    app-2-module-N.py

# modules shared across multiple apps
common
    common-module-1.py
    common-module-2.py
    common-module-N.py

# tests - place at package level with sub-packages for each package -OR- underneath each app package?
tests
    app-1
        test_app-1-module-1.py
        test_app-1-module-2.py
        test_app-1-module-N.py
    app-2
        test_app-2-module-1.py
        test_app-2-module-2.py
        test_app-2-module-N.py

# virtual env folder placed at same level as packages ???
venv

And execute each app via the following ...

python -m app-1

Questions

1. Does the above structure sound reasonable?
2. Where to place virtual env files and what to call this folder? venv, .env, etc?
3. Where to put tests (pytest)? In a tests folder or under each package?
4. Use a src folder or not? If so, where to put above files relative to the src folder?

Malcolm
Re: Block Ctrl+S while running Python script at Windows console?
> This has nothing to do with Python does it? Isn't Python just writing to
> stdout and those write calls are blocking because the terminal emulator
> has stopped reading the other end of the
> pipe/pty/queue/buffer/whatever-it's-called-in-windows?

You're right. But I wasn't sure. I know Python can trap Ctrl+C break signals, so I wondered if there was a similar tie-in for Ctrl+S. Eryk did a great job explaining the tech details of what's happening behind the scenes in the Windows world. In my case it turned out that my clicking between windows was inadvertently selecting text, pausing the running process. Unchecking the Windows console's Quick Edit mode stops this behavior.

Malcolm
Re: Block Ctrl+S while running Python script at Windows console? (solved)
Eryk,

> Another common culprit is quick-edit mode, in which case a stray mouse click
> can select text, even just a single character. The console pauses while text
> is selected.

MYSTERY SOLVED !! THANK YOU !! Apparently, while mouse clicking between windows, I was inadvertently selecting a char on my console, thereby pausing the process that was running. Disabling Quick-Edit mode via the Properties dialog fixes this behavior. I tried other console property setting combinations to block Ctrl+S keyboard behavior, but without success. Not a problem since I'm pretty sure I'm not typing random Ctrl+S keystrokes while working. Thanks again Eryk!

Malcolm
Block Ctrl+S while running Python script at Windows console?
I'm running some Python 3.6 scripts at the Windows 10/Windows Server 2016 console. In my everyday workflow, I seem to be accidentally sending Ctrl+S keystrokes to some of my console sessions, pausing my running scripts until I send another corresponding Ctrl+S to un-pause the affected scripts. My challenge is I don't know when I'm sending these keystrokes other than seeing scripts that seem to have stopped, clicking on their console window, and typing Ctrl+S to unblock them. Wondering if there's a way to have my Python scripts ignore these Ctrl+S signals or if this behavior is outside of my Python script's control. If there's a way to disable this behavior under Windows 10/Windows Server 2016, I'm open to that as well. Thank you, Malcolm
Multiprocessing vs subprocess to run Python scripts in parallel
Use case: I have a Python manager script that monitors several conditions (not just time based schedules) that needs to launch Python worker scripts to respond to the conditions it detects. Several of these worker scripts may end up running in parallel. There are no dependencies between individual worker scripts. I'm looking for the pros and cons of using multiprocessing or subprocess to launch these worker scripts. Looking for a solution that works across Windows and Linux. Open to using a 3rd party library. Hoping to avoid the use of yet another system component like Celery if possible and rational. My understanding (so far) is that the tradeoff of using multiprocessing is that my manager script cannot exit until all the worker processes it starts finish. If one of the worker scripts locks up, this could be problematic. Is there a way to use multiprocessing where processes are launched independent of the parent (manager) process? Thank you, Malcolm
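For comparison, here is a minimal sketch of the subprocess route, in which each worker runs in its own interpreter and the manager polls without blocking. The helper names `launch_worker` and `reap_finished` are invented for this example, and whether this suits the poster's manager script is an assumption.

```python
import subprocess
import sys


def launch_worker(script_args):
    """Start a worker in its own interpreter and return immediately."""
    return subprocess.Popen([sys.executable, *script_args])


def reap_finished(workers):
    """Split workers into (still running, finished) without blocking."""
    running = [w for w in workers if w.poll() is None]
    finished = [w for w in workers if w.poll() is not None]
    return running, finished
```

A worker that locks up can be stopped with `Popen.kill()` after a timeout the manager chooses, and each finished worker's `returncode` tells the manager whether it succeeded.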
Convert Windows paths to Linux style paths
Looking for best practice technique for converting Windows style paths to Linux paths. Is there an os function or pathlib method that I'm missing or is it some combination of replacing Windows path separators with Linux path separators plus some other platform specific logic? Thank you, Malcolm
Re: Best way to remove unused pip installed modules/module dependencies from a virtual env?
[Reformatted as original post got mangled]

Looking for advice on the best way to remove unused modules from a Python virtual environment. My setup is Python 3.6.6 running on macOS although I believe my use case is OS independent. Background: Long running project that, over the course of time, pip installed modules that are no longer used by the code. I'm looking for a way to identify unused modules and remove them. Here's my back-of-napkin strategy to do this. Wondering if there are holes in this approach or if there's an off-the-shelf solution for my use case?

1. pip freeze > modules.txt
2. build a list of all import statements, extract out module names
3. remove these module names from modules.txt and add to used-modules.txt
4. modules that remain in modules.txt are either module dependencies of directly imported modules or no longer used
5. remove my virtual env and recreate it again to start with a fresh env
6. reinstall each directly imported module (from list in used-modules.txt); this will pull in dependencies again
7. pip freeze > requirements.txt <--- this should be the exact modules used by our code

Thank you, Malcolm
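Step 2 of the list above (collecting imported module names) could be sketched with the standard library's ast module. The function names are made up for this example. One caveat to keep in mind: an import name does not always match the name pip uses for the distribution (for example, `yaml` is installed as PyYAML), so the result still needs a manual mapping before it is compared with pip freeze output.

```python
import ast
from pathlib import Path


def imported_modules(source):
    """Return the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x".
            names.add(node.module.split(".")[0])
    return names


def project_imports(root):
    """Collect imported module names from every .py file under root."""
    found = set()
    for path in Path(root).rglob("*.py"):
        found |= imported_modules(path.read_text(encoding="utf-8"))
    return found
```

Imports done dynamically (importlib.import_module, plugins named in config files) are invisible to this kind of static scan.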
Best way to remove unused pip installed modules/module dependencies from a virtual env?
Looking for advice on the best way to remove unused modules from a Python virtual environment. My setup is Python 3.6.6 running on macOS although I believe my use case is OS independent. Background: Long running project that, over the course of time, pip installed modules that are no longer used by the code. I'm looking for a way to identify unused modules and remove them. Here's my back-of-napkin strategy to do this. Wondering if there are holes in this approach or if there's an off-the-shelf solution for my use case?

1. pip freeze > modules.txt
2. build a list of all import statements, extract out module names
3. remove these module names from modules.txt and add to used-modules.txt
4. modules that remain in modules.txt are either module dependencies of directly imported modules or no longer used
5. remove my virtual env and recreate it again to start with a fresh env
6. reinstall each directly imported module (from list in used-modules.txt); this will pull in dependencies again
7. pip freeze > requirements.txt <--- this should be the exact modules used by our code

Thank you, Malcolm
Normalizing path strings and separators in cross-platform unit test scripts
Any recommendations on normalizing path strings across platforms (Windows, Linux, macOS) for unit tests? Our goal is to normalize path strings to use forward slash separators so that we can consistently reference path strings in our unit tests in a cross platform way. Example: Under Windows we have two paths that are logically the same but fail to match for test purposes.

    assert str(full_path(f'{test_folder_path}/readonly.txt')) == 'C:/udp-app-master/dev/tmp/readonly.txt'
    E AssertionError: assert 'C:\\udp-app-...\readonly.txt' == 'C:/udp-app-ma.../readonly.txt'

Is there a best practice way to convert Windows style paths (with backslash path separators) to Linux style paths with forward slash path separators? I've looked at the os and pathlib libraries without seeing anything that describes our need. Any downsides to this approach? Thank you
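One option that pathlib does offer is `PurePath.as_posix()`, which renders a path with forward slashes. The sketch below uses `PureWindowsPath` so that it behaves the same on every OS; the helper name `to_posix` is made up for this example.

```python
from pathlib import PureWindowsPath


def to_posix(path):
    """Render a Windows-style path string with forward slashes."""
    return PureWindowsPath(path).as_posix()


print(to_posix(r"C:\udp-app-master\dev\tmp\readonly.txt"))
```

In a test, normalizing both sides of the comparison this way (or comparing path objects instead of strings) avoids depending on which separator the code under test happened to emit.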
Best practice for upgrading SQLite C library (DLL, SO, etc) that ships with Python
I noticed that there's a rather big gap between the latest version of SQLite and the version of SQLite that ships with Python 3.6/3.7. Is there best practice advice for upgrading the SQLite C library that ships with Python ... without causing havoc and mayhem on my system?

Options:

Don't do it - the universe will split
Do it - just replace the DLL/SO library in your Python installation's folder
Do it - but rename the updated version so as not to overwrite the default SQLite library?
Do it - using some type of virtual environment magic so your change is truly isolated
Other?

Are there OS specific issues to be concerned with or is there a general pattern here? I work across Windows, Linux, and macOS. Thank you, Malcolm
Logging - have logger use caller's function name and line number
I have a class method that adds additional context to many of the class's log messages. Rather than calling, for example, logger.info( 'message ...' ) I would like to call my_object.log( 'message ...' ) so that this additional context is managed in a single place. I'm actually doing that and it's working great except that ... as expected ... all my log records have the my_object.log method's name and line numbers vs the callers' names and line numbers. Is there a technique I can use to have my custom log method inject the caller's name and line numbers in the log output?
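On Python 3.8 and later, the logging calls accept a `stacklevel` argument that does this. The sketch below assumes that newer Python version (it is not available on 3.6), and the `Job` class and its `[job=...]` context prefix are hypothetical stand-ins for the poster's class.

```python
import logging

logger = logging.getLogger("demo")


class Job:
    """Hypothetical class whose log() wrapper adds context to each message."""

    def __init__(self, name):
        self.name = name

    def log(self, message):
        # stacklevel=2 (Python 3.8+) attributes the record to log()'s caller,
        # so %(funcName)s and %(lineno)d point at run(), not at log().
        logger.info("[job=%s] %s", self.name, message, stacklevel=2)

    def run(self):
        self.log("starting")
```

With a format such as `%(funcName)s:%(lineno)d %(message)s`, a call to `Job("a").run()` is then reported as coming from `run`.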
What Python related git pre-commit hooks are you using?
Curious to learn what Python related git pre-commit hooks people are using? What hooks have you found useful and which hooks have you tried and abandoned? Appreciate any suggestions for those new to this process. Background: Windows, macOS, and Linux dev environments, PyCharm professional edition IDE, 64-bit Python 3.6, private GitHub repos. Considering black (standardize formatting), pylama (multiple static code tests) and possibly a hook into our pytest test runner. Thanks!
Anyone running Python on MS Azure?
Curious to hear if any of you are running Python scripts/apps on MS Azure cloud services? What services are you using and what has your experience been? Advice? Background: Customer migrating to Azure. I'm trying to get ahead of the curve regarding how Python-friendly the Azure platform is. Thanks!
Verify pip's requirements.txt file at runtime?
Is there a technique that would allow a script to verify its requirements.txt file before it runs? Use case: To detect unexpected changes to a script's runtime environment. Does the pip module expose any methods for this type of task? Thank you, Malcolm
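One approach that avoids importing pip is the standard library's importlib.metadata (Python 3.8+, so newer than the versions discussed elsewhere in this thread). The sketch below is deliberately limited: it only understands exact `name==version` pins and ignores everything else (environment markers, extras, `-r` includes). The function name `check_pins` is made up for this example.

```python
from importlib import metadata  # Python 3.8+


def check_pins(requirement_lines):
    """Return a list of problems for simple 'name==version' lines."""
    problems = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue  # this sketch ignores anything that is not an exact pin
        name, expected = (part.strip() for part in line.split("==", 1))
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if installed != expected:
            problems.append(f"{name}: expected {expected}, found {installed}")
    return problems
```

A script could call this at startup with the lines of its requirements.txt and refuse to run if the returned list is not empty.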
logging module - how to include method's class name when using %(funcName)
I'm using the Python logging module and looking for a way to include a method's class name when using %(funcName). Is this possible? When you have several related classes, just getting the function (method) name is not enough information to provide context on the code being executed. I'm outputting module name and line number so I can always go back and double check a caller's location in source, but that seems like an archaic way to find this type of information. Thank you, Malcolm
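One way to get the class name into each record is a `logging.LoggerAdapter` that carries it as an extra field. The sketch below assumes the classes can share a small mixin; the names `ClassNameLogging`, `Importer`, and the `classname` field are all invented for this example.

```python
import logging


class ClassNameLogging:
    """Mixin that gives each instance a logger adapter carrying its class name."""

    @property
    def log(self):
        base = logging.getLogger(type(self).__module__)
        # The extra dict makes %(classname)s available to formatters.
        return logging.LoggerAdapter(base, {"classname": type(self).__name__})


class Importer(ClassNameLogging):
    def load(self):
        self.log.info("loading")


LOG_FORMAT = "%(module)s:%(lineno)d %(classname)s.%(funcName)s %(message)s"
```

A caveat with this design: a formatter that references `%(classname)s` will fail to format records from loggers that do not supply that field, so it is best attached to a handler that only sees these classes' records.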
Cross platform mutex to prevent script running more than one instance?
Use case: Want to prevent 2+ instances of a script from running ... ideally in a cross platform manner. I've been researching this topic and am surprised how complicated this capability appears to be and how diverse the solution set is. I've seen solutions ranging from using directories, named temporary files, named sockets/pipes, etc. Is there any consensus on best practice here? Thank you, Malcolm
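Of the approaches listed, the lock-file variant can be sketched with only the standard library, relying on `os.O_CREAT | os.O_EXCL` to make creation fail when the file already exists. The names `single_instance` and `AlreadyRunning` are made up for this example, and this is a sketch rather than a recommendation: a crash can leave a stale lock file behind, which a real script would have to detect (for instance by checking the PID stored in the file).

```python
import os
import tempfile
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when another instance already holds the lock file."""


@contextmanager
def single_instance(name, folder=None):
    """Hold an exclusive lock file for the duration of the with block."""
    lock_path = os.path.join(folder or tempfile.gettempdir(), f"{name}.lock")
    try:
        # O_CREAT | O_EXCL fails if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(lock_path) from None
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        os.close(fd)
        yield lock_path
    finally:
        os.remove(lock_path)
```

Usage is `with single_instance("my-script"): main()`, catching `AlreadyRunning` to exit early.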
Re: Verifying the integrity/lineage of a file
Thanks for the replies! I'm going to investigate the use of python-gnupg [1] which is a Python wrapper for the GPG command line utility. This library is based on gpg.py written by Andrew Kuchling. I'm all ears if anyone has any alternative recommendations or python-gnupg tips to share. BTW: Target clients are running under Windows and Linux. [1] https://pythonhosted.org/python-gnupg/
bytes() or .encode() for encoding str's as bytes?
Is there a benefit to using one of these techniques over the other? Is one approach more Pythonic and preferred over the other for style reasons?

    message = message.encode('UTF-8')
    message = bytes(message, 'UTF-8')

Thank you, Malcolm
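For a str input the two spellings give the same result, which the short check below demonstrates. The sample string is arbitrary.

```python
message = "na\u00efve caf\u00e9"

via_method = message.encode("UTF-8")
via_constructor = bytes(message, "UTF-8")

# Both calls produce the same bytes object for a str input.
assert via_method == via_constructor

# Only bytes() also accepts non-str inputs such as an iterable of ints.
assert bytes([104, 105]) == b"hi"
```

Since the difference is stylistic for strings, `.encode()` has the small advantage of pairing visibly with `.decode()` on the way back.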
Verifying the integrity/lineage of a file
I have a use case where I need to distribute binary files to customers and want to provide a way for our customers to verify the "integrity/lineage" (I know there's a better description, but can't think of it) of these files, eg. to give them the confidence that the files in question are from me and haven't been altered. Here are the methods I can think of using Python:

1. Use hashlib to hash each file (SHA256+) and send the hashes separately for verification
2. Use hmac to sign each file
3. Use a 3rd party crypto library to sign each file and use a set of public/private SSH keys for verification

Any suggestions on techniques and/or libraries appreciated. Thank you, Malcolm
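Options 1 and 2 from the list above can be sketched with the standard library alone; the helper names are invented for this example. Worth noting when choosing between them: a plain hash only shows the file is unaltered if the hash itself arrives over a trusted channel, and an HMAC requires a secret key shared with each customer, so neither proves origin the way a public-key signature (option 3) does.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hmac_file(path, key):
    """Return a hex HMAC-SHA256 tag; needs a key shared with the recipient."""
    with open(path, "rb") as handle:
        return hmac.new(key, handle.read(), hashlib.sha256).hexdigest()


def verify(path, expected_hex):
    """Check a file against a published SHA-256 digest in constant time."""
    return hmac.compare_digest(sha256_file(path), expected_hex)
```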
Re: How to pass Python command line options (vs arguments) when running script directly vs via Python interpreter?
Thanks to everyone who contributed to this thread. Great feedback and suggestions! - Malcolm
Re: How to pass Python command line options (vs arguments) when running script directly vs via Python interpreter?
> You might try:
> from getopt import getopt
> or the (apparently newer):
> from optparse import OptionParser

Thanks Mike. My question was trying to make a distinction between Python options (flags that precede the script or module name) and arguments (the script specific values passed on the command line following the script's name). Here's a description of the options I'm referring to: https://docs.python.org/3/using/cmdline.html#generic-options
Re: How to pass Python command line options (vs arguments) when running script directly vs via Python interpreter?
> If you run the script directly, by entering >script.py or clicking a script
> icon or name in File Explorer, it runs python without python options *other
> than those specified in environmental variables*.

Understood. I thought there might have been a way to pass Python option values via a single environment variable (something like PYTHONOPTIONS) vs individual environment variables.

Thank you, Malcolm
How to pass Python command line options (vs arguments) when running script directly vs via Python interpreter?
When you run a script via "python3 script.py" you can include command line options like -b, -B, -O, -OO, etc between the "python3" interpreter reference and the script.py file, eg. "python3 -b -B -O -OO script.py". When you create a script that is executable directly, eg. script.py with execution bit set on Linux or on Windows where the .py file extension is associated with a specific Python executable, there doesn't appear to be a way to pass command line options to the script. In this latter case, how can I pass my script command line options without having these options confused with command line arguments? Thank you, Malcolm
Anyone using cloud based monitoring/logging services with Python logging module?
Looking for feedback from anyone who's using a cloud based monitoring/logging service with Python's standard lib logging module, eg. services such as Datadog, Loggly, Papertrailapp, Scalyr, LogDNA, Logz.io, Logentries, Loggr, Logstats? Would appreciate hearing your experience, pros, cons, recommendations, etc. Thanks! Malcolm
Distributing Python virtual environments
We're using virtual environments with Python 3.6. Since all our pip installed modules are in our environment's local site-packages folder, is the distribution of a virtual environment as simple as recursively zipping the environment's root folder and distributing that archive to another machine running the same OS and same version of Python? The reason we would go this route vs downloading dependencies from a requirements.txt file is that the target machines may be in a private subnet with minimal access to the internet.
Re: Python 3.6: How to expand f-string literals read from a file vs inline statement
> Perhaps it doesn't need to be said, but just to be sure: don't use eval if
> you don't trust the people writing the configuration file. They can do nearly
> unlimited damage to your environment. They are writing code that you are
> running.

Of course! Script and config file are running in a private subnet and both are maintained by a single developer.
Best practice for managing secrets (passwords, private keys) used by Python scripts running as daemons
Looking for your suggestions on best practice techniques for managing secrets used by Python daemon scripts. Use case is Windows scripts running as NT Services, but interested in Linux options as well. Here's what we're considering:

1. Storing secrets in environment vars
2. Storing secrets in config file only in subfolder with access limited to daemon account only
3. Using a 3rd party vault product

Thanks
Re: Python 3.6: How to expand f-string literals read from a file vs inline statement
My original post reformatted for text mode:

Looking for advice on how to expand f-string literal strings whose values I'm reading from a configuration file vs hard coding into my script as statements. I'm using f-strings as a very simple template language. I'm currently using the following technique to expand these f-strings. Is there a better way? Bonus if anyone has suggestions on how my expand() function can access the locals() value of the caller so this parameter doesn't have to be passed in explicitly.

    from textwrap import dedent

    def expand(expression, values):
        """Expand expression using passed in locals()"""
        triple_quote = "'" * 3
        expression = dedent(expression)
        return eval(f'f{triple_quote}{expression}{triple_quote}', None, values)

    product_name = 'Bike ABC'
    product_sku = '123456'
    product_price = 299.95
    discount = 0.85

    # read in product description template
    # product_description_template might look like:
    # {product_sku} : {product_name}: ${product_price * discount}
    product_description_template = config('product_description_template')

    # expand the {expressions} in product_description_template using my locals()
    product_description = expand(product_description_template, locals())
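On the bonus question, one possibility is to look up the caller's frame when no values are passed. The sketch below uses `sys._getframe`, which is a CPython implementation detail rather than a guaranteed API, and keeps the eval-based approach from the post, so the same trust caveats apply. The `demo` function and its simplified prices are made up for illustration.

```python
import sys
from textwrap import dedent


def expand(expression, values=None):
    """Expand an f-string style template, defaulting to the caller's locals."""
    if values is None:
        # Frame 0 is expand(); frame 1 is whoever called it.
        values = sys._getframe(1).f_locals
    triple_quote = "'" * 3
    expression = dedent(expression)
    return eval(f"f{triple_quote}{expression}{triple_quote}", None, values)


def demo():
    product_name = "Bike ABC"
    product_sku = "123456"
    product_price = 100
    discount = 0.5
    template = "{product_sku} : {product_name}: ${product_price * discount}"
    return expand(template)
```

Here `demo()` returns `'123456 : Bike ABC: $50.0'` without passing locals() explicitly.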
Python 3.6: How to expand f-string literals read from a file vs inline statement
Looking for advice on how to expand f-string literal strings whose values I'm reading from a configuration file vs hard coding into my script as statements. I'm using f-strings as a very simple template language. I'm currently using the following technique to expand these f-strings. Is there a better way? Bonus if anyone has suggestions on how my expand() function can access the locals() value of the caller so this parameter doesn't have to be passed in explicitly.

    def expand(expression, values):
        """Expand expression using passed in locals()"""
        triple_quote = "'" * 3
        expression = dedent(expression)
        return eval(f'f{triple_quote}{expression}{triple_quote}', None, values)

    product_name = 'Bike ABC'
    product_sku = '123456'
    product_price = 299.95
    discount = 0.85

    # read in product description template
    # product_description_template might look like:
    # {product_sku} : {product_name}: ${product_price * discount}
    product_description_template = config('product_description_template')

    # expand the {expressions} in product_description_template using my locals()
    product_description = expand(product_description_template, locals())
Re: Standard lib version of something like enumerate() that takes a max count iteration parameter?
Thank you Peter and Jussi - both your solutions were very helpful! Malcolm
Standard lib version of something like enumerate() that takes a max count iteration parameter?
Wondering if there's a standard lib version of something like enumerate() that takes a max count value? Use case: When you want to enumerate through an iterable, but want to limit the number of iterations without introducing if-condition-break blocks in code. Something like:

    for counter, key in enumerate( some_iterable, max_count=10 ):

Thank you, Malcolm
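enumerate() itself has no such parameter, but combining it with `itertools.islice` gives the same effect. The wrapper name `enumerate_limited` is made up for this example; the thread does not show which solutions were actually suggested.

```python
from itertools import islice


def enumerate_limited(iterable, max_count, start=0):
    """Like enumerate(), but stop after max_count items."""
    return enumerate(islice(iterable, max_count), start)


for counter, key in enumerate_limited("abcdef", 3):
    print(counter, key)
```

Because islice is lazy, this also works on generators and other iterables that cannot be sliced with `[:10]`.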
Best practice tips for sizing thread/process pools and concurrent futures max_workers?
Looking for best practice tips on how to size thread/process pools or max workers with the concurrent futures module. Are there specific heuristics that can be applied or is this more a manual tuning process based on the run time behavior of a script and the nuances of one's environment? Are there any 3rd party libraries that monitor CPU/core and memory utilization to dynamically tune these values based on system capacity? Thank you, Malcolm
Elegant way to merge dicts without overwriting keys?
I have a bunch of pickled dicts I would like to merge. I only want to merge unique keys but I want to track the keys that are duplicated across dicts. Is there a newer dict-like data structure that is fine tuned to that use case? Short of an optimized data structure, my plan is to convert dict keys to sets and compare these sets to determine which keys are unique and can be merged and which keys are dupes and should be tracked in that manner. At a high level, does this sound like a reasonable approach? Thank you, Malcolm
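The set-based plan described above can be sketched directly, since dict key views already support set operations. The function name `merge_unique` and the choice to keep the first-seen value for a duplicated key are assumptions; the post does not say which value should win.

```python
def merge_unique(dicts):
    """Merge dicts, keeping first-seen values and recording duplicate keys.

    Returns (merged, duplicates) where duplicates maps each repeated key
    to the indexes of the later dicts in which it appeared again.
    """
    merged = {}
    duplicates = {}
    for index, current in enumerate(dicts):
        # Key views behave like sets, so & and - work without conversion.
        for key in current.keys() & merged.keys():
            duplicates.setdefault(key, []).append(index)
        for key in current.keys() - merged.keys():
            merged[key] = current[key]
    return merged, duplicates
```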
Re: Who still supports recent Python on shared hosting
Another endorsement for Webfaction.
Quick way to calculate lines of code/comments in a collection of Python scripts?
Looking for a quick way to calculate lines of code/comments in a collection of Python scripts. This isn't a LOC per day per developer type analysis - I'm looking for a metric to quickly judge the complexity of a set of scripts I'm inheriting. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
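For a rough first look, a few lines of Python may be enough. This sketch classifies each line as blank, comment-only, or code; it is deliberately naive (docstrings and lines with trailing comments count as code), and count_lines is a made-up helper name:

```python
def count_lines(source):
    """Classify each line of Python source text as blank, comment-only or code."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith("#"):
            counts["comment"] += 1
        else:
            counts["code"] += 1
    return counts

sample = "# header\nimport os\n\nx = 1  # trailing\n"
sample_counts = count_lines(sample)
```

To cover a collection of scripts, the same function could be applied to the text of each *.py file and the results summed.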
Re: Looking for tips and gotchas for working with Python 3.5 zipapp feature
Hi Paul, > Just one further note, which may or may not be obvious. If your application > uses external dependencies from PyPI, you can bundle them with your > application using pip's --target option ... Cool stuff! To your question: None of what you've shared has been obvious to me :) Packaging and distributing Python scripts as zipped archives is such a powerful feature I'm surprised that there hasn't been more written on this topic. Thank you for sharing these tips with me and the rest of the Python list community !! Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Obtain the raw line of text read by CSVDictReader when reporting errors?
Oscar/MRAB, > You could put something between the file and the reader ... Thank you both for your suggestions ... brilliant! You guys helped me solve my problem and gave me an excellent strategy for future scenarios. Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Obtain the raw line of text read by CSVDictReader when reporting errors?
Looking for ideas on how I can obtain the raw line of text read by a CSVDictReader. I've reviewed the CSV DictReader documentation and there are no public attributes that expose this type of data. My use case is reporting malformed lines detected when my code validates the dict of data returned by this object. I would like to log the actual line read by the CSVDictReader, not the processed data returned in the dict. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
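One approach along the lines suggested in the replies is to put a small iterator between the file and the reader that remembers the last line it handed over. A minimal sketch, where LineRecorder and the sample data are invented for illustration:

```python
import csv
import io

class LineRecorder:
    """Iterator wrapper that remembers the most recent raw line it yielded."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self.last_line = None

    def __iter__(self):
        return self

    def __next__(self):
        self.last_line = next(self._lines)
        return self.last_line

# Stand-in for an open CSV file.
source = io.StringIO("name,qty\nwidget,3\ngadget,oops\n")
recorder = LineRecorder(source)

bad_lines = []
for row in csv.DictReader(recorder):
    if not row["qty"].isdigit():
        # Log the raw text, not the parsed dict.
        bad_lines.append(recorder.last_line)
```

Note that for records spanning several physical lines (quoted fields with embedded newlines), last_line only holds the final physical line of the record.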
Re: Looking for tips and gotchas for working with Python 3.5 zipapp feature
Hi Paul, WOW!:) I really appreciate the detailed response. You answered all my questions. I'm looking forward to testing out your pylaunch wrapper. Thank you very much! Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Looking for tips and gotchas for working with Python 3.5 zipapp feature
Looking for tips or edge case gotchas associated with using Python 3.5's new zipapp feature. For those of you wondering what this feature is, see the end of this post for a brief background [1]. Two questions in particular: 1. Are there any issues with deploying scripts that sit in sub-folders beneath the directory being zipped, eg. does zipapp only support a flat folder of scripts or does it recursively zip and path sub-folders? 2. Can additional non-Python files like config files be added to a zipapp without breaking them and if so, how would your script reference these embedded files (by opening up the zipapp as a zip archive and navigating from there?). Thank you, Malcolm [1] The zipapp feature of Python 3.5 is pretty cool: It allows you to package your Python scripts in a single executable zip file. This isn't a replacement for tools like PyInstaller or Py2Exe, eg. it doesn't bundle the Python interpreter in the zip file, but it's a clean way to distribute multi-file scripts in environments where you have control over users' Python setups. Here's the manual page: zipapp - Manage executable python zip archives https://docs.python.org/3/library/zipapp.html -- https://mail.python.org/mailman/listinfo/python-list
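Both questions can be explored with a small experiment. The sketch below builds a throwaway application with a sub-package and a config file, zips it with zipapp.create_archive(), and reads the config from inside the archive with pkgutil.get_data(). All file and package names (myapp, pkg, settings.cfg) are invented for the example:

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "myapp"
    (src / "pkg").mkdir(parents=True)
    (src / "pkg" / "__init__.py").write_text("")
    # A non-Python data file stored inside a sub-folder of the app.
    (src / "pkg" / "settings.cfg").write_text("greeting=hello")
    # __main__.py is the entry point; get_data() reads from within the zip.
    (src / "__main__.py").write_text(
        "import pkgutil\n"
        "print(pkgutil.get_data('pkg', 'settings.cfg').decode())\n"
    )
    target = pathlib.Path(tmp) / "myapp.pyz"
    zipapp.create_archive(str(src), str(target))
    output = subprocess.run(
        [sys.executable, str(target)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout.strip()
```

In this sketch the sub-folder is included in the archive and the embedded file is readable without unpacking anything, as long as the code goes through the import system's data access rather than open() on a filesystem path.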
Re: Discover all non-standard library modules imported by a script
Thanks for your suggestions Chris and Terry. The answer I was looking for is the modulefinder module which is part of the standard lib. Works like a charm! Quote: This module provides a ModuleFinder class that can be used to determine the set of modules imported by a script. modulefinder.py can also be run as a script, giving the filename of a Python script as its argument, after which a report of the imported modules will be printed. https://docs.python.org/3.5/library/modulefinder.html Note there's a similar module for Python 2.7. -- Malcolm -- https://mail.python.org/mailman/listinfo/python-list
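For readers who want to try modulefinder programmatically rather than as a script, here is a small sketch. The temporary main_script.py and its two imports are made up for the demonstration:

```python
import os
import tempfile
from modulefinder import ModuleFinder

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "main_script.py")
    with open(script, "w") as f:
        f.write("import json\nfrom collections import OrderedDict\n")

    finder = ModuleFinder()
    finder.run_script(script)
    # finder.modules maps module names to Module objects, including
    # modules pulled in indirectly by other modules.
    imported = sorted(name for name in finder.modules if name != "__main__")
```

The result includes standard library modules as well; separating out the non-standard ones would need an extra filtering step, for example on each module's file location.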
Discover all non-standard library modules imported by a script
Looking for suggestions on how, given a main script, to discover all the non-standard library modules imported across all modules, eg. the modules that other modules import, etc. I'm not looking to discover dynamic imports or other edge cases, just the list of modules loaded via "import <module>" and "from <module> import ...". I know I could write a script to do this, but certainly there must be such a capability in the standard library? Use case: Discovering list of modules to use for building a Python 3.5 zipapp distributable. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Dynamically import specific names from a module vs importing full module
Ned and Random832, Ned: Thank you - your example answered my question. Random832: Thank you for the reminder about "from <module> import <name>" still importing the module. Yes, I had forgotten that behavior. Best, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Dynamically import specific names from a module vs importing full module
Python 3.5: Is there a way to dynamically import specific names from a module vs importing the full module? By dynamic I mean via some form of importlib machinery, eg. I'm looking for the dynamic "from <module> import <name>" equivalent of "import <module>"'s importlib.import_module. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
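A common pattern for this is to import the module with importlib.import_module() and then pull the wanted names off it with getattr(). The helper name import_names below is hypothetical:

```python
import importlib

def import_names(module_name, *names):
    """Rough dynamic equivalent of 'from module_name import name1, name2'."""
    # The whole module is still imported, exactly as with a static from-import.
    module = importlib.import_module(module_name)
    return tuple(getattr(module, name) for name in names)

sqrt, pi = import_names("math", "sqrt", "pi")
```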
Advice on optimizing a Python data driven rules engine
Background: I'm building a rules engine for transforming rows of data being returned by csv DictReader, eg. each row of data is a dict of column name to value mappings. My rules are a list of rule objects whose attributes get referenced by rule specific methods. Each rule has an associated method which gets called based on rule name. Looking for some advice on how to optimize the BOILERPLATE portions of the following type of code. There's an awful lot of dot dereferencing going on. One thought was to pass in the values being dereferenced as parameters and return value. But this would just move the dereferencing to another point in my program and would add the overhead of parameter passage. Is there a technique where I could store a reference to these values that would make their access more efficient?

    def action_strip(self):
        # BOILERPLATE: lookup value being transformed (possible to get a
        # direct 'pointer' to this dic entry???)
        value = self.data[self.rule.target_column]

        # BOILERPLATE: get rule's hard coded parameter
        match = self.rule.value.strip()

        # perform a generic transformation action
        while value.startswith(match):
            value = value.replace(match, '', 1).strip()
        while value.endswith(match):
            value = value[0:len(value) - len(match)].strip()

        # BOILERPLATE: update the value we just transformed
        self.data[self.rule.target_column] = value

Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
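One possible restructuring, offered only as a sketch: keep the transformation as a plain function of (value, match) and do each lookup exactly once in a thin wrapper, so the loops work purely on local names. The names strip_match and apply_strip are hypothetical, and the guard against an empty match string is an addition not present in the original method:

```python
def strip_match(value, match):
    """Repeatedly remove `match` from both ends of `value`, stripping whitespace."""
    match = match.strip()
    if not match:
        # Added guard: an empty match would otherwise loop forever.
        return value
    size = len(match)
    while value.startswith(match):
        value = value[size:].strip()
    while value.endswith(match):
        value = value[:-size].strip()
    return value

def apply_strip(data, column, match):
    # One dict read and one dict write per row; everything else is local.
    data[column] = strip_match(data[column], match)

row = {"code": "--abc--"}
apply_strip(row, "code", "-")
```

Whether this is measurably faster depends on the data; timing both versions with timeit on representative rows would settle it.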
Re: Capturing the bad codes that raise UnicodeError exceptions during decoding
Wow!!! A huge thank you to all who replied to this thread! Chris: You gave me some ideas I will apply in the future. MRAB: Thanks for exposing me to the extended attributes of the UnicodeError object (e.start, e.end, e.object). Mike: Cool example! I like how _cleanlines() recursively calls itself to keep cleaning up a line after an error is handled. Your code solved the mystery of how to recover from a UnicodeError and keep decoding. Random832: Your suggestion to write a custom codecs handler was great. Sample below for future readers reviewing this thread.

    # simple codecs custom error handler
    import codecs

    def custom_unicode_error_handler(e):
        bad_bytes = e.object[e.start:e.end]
        print( 'Bad bytes: ' + bad_bytes.hex())
        return ('', e.end)

    codecs.register_error('custom_unicode_error_handler', custom_unicode_error_handler)

Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Capturing the bad codes that raise UnicodeError exceptions during decoding
Hi Chris, Thanks for your suggestions. I would like to capture the specific bad codes *before* they get replaced. So if a line of text has 10 bad codes (each one raising UnicodeError), I would like to track each exception's bad code but still return a valid decoded line when finished. My goal is to count the total number of UnicodeErrors within a file (as a data quality metric) and track the frequency of specific bad codes (via a collections.Counter dict) to see if there's a pattern that can be traced to a bad upstream process. Malcolm

Remove them? Not sure what you mean, exactly; but would an errors="backslashreplace" decode do the job? Something like (assuming you use Python 3):

    def read_dirty_file(fn):
        with open(fn, encoding="utf-8", errors="backslashreplace") as f:
            for row in csv.DictReader(f):
                process(row)

You'll get Unicode text, but any bytes that don't make sense in UTF-8 will be represented as eg \x80, with an actual backslash. Or use errors="replace" to hide them all behind U+FFFD, or other forms of error handling. That'll get done at a higher level than the CSV reader, like you suggest. -- https://mail.python.org/mailman/listinfo/python-list
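The counting goal described above can be sketched with a custom codecs error handler that feeds a collections.Counter. The handler name count_bad_bytes and the sample bytes are invented for this illustration:

```python
import codecs
from collections import Counter

bad_byte_counts = Counter()

def count_bad_bytes(error):
    """Record the offending bytes, drop them, and let decoding continue."""
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad = error.object[error.start:error.end]
    bad_byte_counts[bad.hex()] += 1
    # Replace the bad bytes with nothing and resume right after them.
    return ("", error.end)

codecs.register_error("count_bad_bytes", count_bad_bytes)

# Latin-1 style bytes that are not valid UTF-8.
dirty = b"caf\xe9 and na\xefve \xff"
clean = dirty.decode("utf-8", errors="count_bad_bytes")
```

The same handler name can be passed as errors= to open(), so the counting happens while csv.DictReader consumes the file.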
Capturing the bad codes that raise UnicodeError exceptions during decoding
I'm processing a lot of dirty CSV files and would like to track the bad codes that are raising UnicodeErrors. I'm struggling to figure out what the exact codes are so I can track them, then remove them, and then repeat the decoding process for the current line until the line has been fully decoded so I can pass this line on to the CSV reader. At a high level it seems that I need to wrap the decoding of a line until it passes without any errors. Any suggestions appreciated. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: print() function with encoding= and errors= parameters?
> You could use win_unicode_console enabled in sitecustomize or usercustomize. > https://pypi.python.org/pypi/win_unicode_console The pypi link you shared has an excellent summary of the issues associated with working with Unicode from the Windows terminal. Thank you Eryk. Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: print() function with encoding= and errors= parameters?
Chris, > Don't forget that the print function can simply be shadowed. I did forget! Another excellent option. Thank you! Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: print() function with encoding= and errors= parameters?
Thank you Random832 and Peter - excellent ideas. My use case was diagnostic output being (temporarily) output to stdout via debug related print statements. The output is for debugging only and not meant for production. I was looking for a solution that would allow me to output to the console with as few changes to the original scripts as possible, eg. non-invasive except for the print statements themselves. When debugging under Linux/OSX, standard print statements work fine because their stdouts' encoding is UTF-8. But under Windows, the stdout is workstation specific and *never* UTF-8. So the occasional non-ASCII string trips up our diagnostic output when tested under Windows. Peter's suggestion to set the PYTHONIOENCODING [1] environment variable is the non-invasive, diagnostic only, non-production solution I was looking for ... for the use case at hand. Again, thank you both. Malcolm [1] PYTHONIOENCODING=ascii:backslashreplace -- https://mail.python.org/mailman/listinfo/python-list
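For cases where setting an environment variable is not an option, a similar effect can be had in code by wrapping a binary stream in io.TextIOWrapper with the desired encoding and errors settings and passing it to print() via file=. In this sketch a BytesIO object stands in for sys.stdout.buffer so the result can be inspected:

```python
import io

raw = io.BytesIO()  # stand-in for sys.stdout.buffer in this illustration
ascii_out = io.TextIOWrapper(
    raw, encoding="ascii", errors="backslashreplace", newline="\n"
)

# Non-ASCII characters are escaped instead of raising UnicodeEncodeError.
print("caf\u00e9", file=ascii_out)
ascii_out.flush()
written = raw.getvalue()
```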
print() function with encoding= and errors= parameters?
Looking for a way to use the Python 3 print() function with encoding and errors parameters. Are there any concerns with closing and re-opening sys.stdout so sys.stdout has a specific encoding and errors behavior? Would this break other standard libraries that depend on sys.stdout being configured a specific way? Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: logging: getLogger() or getLogger(__name__)?
Thank you Laurent! - Original message - From: Laurent Pointal With __name__ you will have one logger per source file (module), with corresponding filtering possibilities, and organized hierarchically as are packages (logging use . to built its loggers hierarchy). Without __name__, you have one global default logger. -- https://mail.python.org/mailman/listinfo/python-list
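The hierarchy Laurent describes can be seen directly. In this sketch "myapp" and "myapp.db" are hypothetical names standing in for what __name__ would produce in a package and one of its modules:

```python
import logging

root = logging.getLogger()             # the single global root logger
pkg = logging.getLogger("myapp")       # what __name__ gives in myapp/__init__.py
mod = logging.getLogger("myapp.db")    # what __name__ gives in myapp/db.py

# Dotted names form a tree: myapp.db -> myapp -> root.
parent_chain = (mod.parent is pkg, pkg.parent is root)

# Settings on a package logger apply to its modules unless they override them.
pkg.setLevel(logging.DEBUG)
effective = mod.getEffectiveLevel()
```

With a bare getLogger() everywhere, every module shares the root logger and this per-package filtering is not available.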
Re: Behavior of tempfile temp files when scripts killed, interpreter crashes, server crashes?
Hi Eryk, Awesome! Thank you very much for your detailed answer!! Malcolm Linux has the O_TMPFILE open() flag [1]. This creates an anonymous file that gets automatically deleted when the last open file descriptor is closed. If the file isn't opened O_EXCL, then you can make it permanent by linking it back into the filesystem. For example: ... -- https://mail.python.org/mailman/listinfo/python-list
Behavior of tempfile temp files when scripts killed, interpreter crashes, server crashes?
Can someone share their OS specific experience in working with tempfile generated temp files under these conditions? 1. Script killed by another process 2. Interpreter crashes 3. Server crashes (sudden loss of power) 4. Other application termination conditions ??? I'm curious which scenarios result in temp files not being automatically deleted after use and what technique you're using to cleanup temp files left over after these conditions (without affecting legitimate temp files present from the current session)? Do any OS's support a type of temp file that truly gets automatically deleted in all of the above scenarios? Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
logging: getLogger() or getLogger(__name__)?
I've read that best practice for logging is to place the following line at the top of all modules: logger = getLogger(__name__) I'm curious why the following technique wouldn't be a better choice: logger = getLogger() Are there scenarios that favor one style over another? Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Python 3.5 glob.glob() 2nd param (*) and how to detect files/folders beginning with "."?
Hi Jussi, You answered my questions - thank you! Malcolm > 1. The signature for glob.glob() is "glob.glob(pathname, *, >recursive=False)". What is the meaning of the 2nd parameter listed >with an asterisk? It's not a parameter. It's special syntax to indicate that the remaining parameters are keyword-only. > 2. Is there a technique for using glob.glob() to recognize files and >folders that begin with a period, eg. ".profile"? The documentation >states: "If the directory contains files starting with . they won’t >be matched by default.". Any suggestions on what the non-default >approach is to match these type of files? Glob with a pattern that starts with a dot. Glob twice if you want both kinds. Or look into that fnmatch that is referenced from glob documentation and said not to consider leading dots special. -- https://mail.python.org/mailman/listinfo/python-list
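The "glob twice" suggestion can be sketched as follows, using a temporary directory with one hidden and one regular file (both file names are invented for the example):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in (".profile", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    visible = glob.glob(os.path.join(tmp, "*"))    # skips names starting with "."
    hidden = glob.glob(os.path.join(tmp, ".*"))    # matches only dot-names
    names = sorted(os.path.basename(p) for p in visible + hidden)
```

The bare * in the documented signature also means that recursive can only be passed by keyword, as in glob.glob(pattern, recursive=True).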
Python 3.5 glob.glob() 2nd param (*) and how to detect files/folders beginning with "."?
In reading Python 3.5.1's glob.glob documentation[1] I'm puzzled by the following: 1. The signature for glob.glob() is "glob.glob(pathname, *, recursive=False)". What is the meaning of the 2nd parameter listed with an asterisk? 2. Is there a technique for using glob.glob() to recognize files and folders that begin with a period, eg. ".profile"? The documentation states: "If the directory contains files starting with . they won’t be matched by default.". Any suggestions on what the non-default approach is to match these type of files? Thank you, Malcolm [1] https://docs.python.org/3/library/glob.html -- https://mail.python.org/mailman/listinfo/python-list
Re: Possible to capture cgitb style output in a try/except section?
Hi Steven and Peter, Steven: Interestingly (oddly???) enough, the output captured by hooking the cgitb handler on my system appears to be shorter than the default cgitb output. You can see this yourself via this tiny script:

    import cgitb
    cgitb.enable(format='text')
    x = 1/0

The solution I came up with (Python 3.5.1) was to use your StringIO technique with the Hook's 'file=' parameter.

    import io
    cgitb_buffer = io.StringIO()
    # called from inside an except block; handle() formats the current exception
    cgitb.Hook(file=cgitb_buffer, format='text').handle()
    return cgitb_buffer.getvalue()

Peter: Your output was helpful in seeing the difference I mentioned above. Thank you both for your help! Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Possible to capture cgitb style output in a try/except section?
Is there a way to capture cgitb's extensive output in an except clause so that cgitb's detailed traceback output can be logged *and* the except section can handle the exception so the script can continue running? My read of the cgitb documentation leads me to believe that the only way I can get cgitb output is to let an exception propagate to the point of terminating my script ... at which point cgitb grabs the exception and does its magic. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Caller's module name, function/method name and line number for output to logging
Hi Peter, > Fine! Then you can avoid the evil hack I came up with many moons ago: > https://mail.python.org/pipermail/python-list/2010-March/570941.html Evil? Damn evil! Love it! Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Caller's module name, function/method name and line number for output to logging
Hi Terry, >> Is there a technique for accessing a function's *CALLER* module name, >> function/method name and line number so that this information can be > Look in the inspect module for the inspect stack function. Note that > when you call the function, it needs to look 2 levels up. Perfect! That's exactly what I was looking for. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
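A minimal sketch of the inspect.stack() approach: index 0 is the function doing the inspecting and index 1 is its direct caller, so a helper that is itself called from the reporting function has to look one level further up. The function names here are hypothetical:

```python
import inspect

def describe_caller():
    """Return (module name, function name, line number) of the direct caller."""
    # stack()[0] is describe_caller itself; stack()[1] is whoever called it.
    info = inspect.stack()[1]
    module_name = info.frame.f_globals.get("__name__")
    return module_name, info.function, info.lineno

def validate():
    # Pretend an error condition was detected here.
    return describe_caller()

module_name, function_name, line_number = validate()
```

On Python 3.8 and later, the logging methods also accept a stacklevel argument that makes %(funcName)s and %(lineno)d refer to a frame further up the stack.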
Caller's module name, function/method name and line number for output to logging
Is there a technique for accessing a function's *CALLER* module name, function/method name and line number so that this information can be passed to a logging library's logger? I have a routine that detects an error condition, but I want to report the error's context relative to the caller, not the current function. TL;DR: I want to peek 1 level up the call stack for this information. A bonus would be a way to have my logger use the caller's info vs the current %(module)s, %(funcName)s, and %(lineno)d values. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Dynamically call methods where method is known by address vs name
Hi Michael, > Out[3]: 'HELLO' > In [4]: g = str.upper > In [5]: g(s) > Out[5]: 'HELLO' That's perfect! My mistake was trying to use the method returned by ''.upper vs. str.upper. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
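To spell out the difference between the two forms: str.upper is the plain function and needs the string passed in, while s.upper is already bound to s and is simply called with no arguments.

```python
s = "lower"

# Function looked up on the class: the instance is passed explicitly.
upper_method = str.upper
result = upper_method(s)

# Method looked up on the instance: it already remembers s.
bound_upper = s.upper
bound_result = bound_upper()
```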
Dynamically call methods where method is known by address vs name
I know I can do the following:

    >>> s = 'lower'
    >>> getattr(s, 'upper')()
    'LOWER'

But how could I do the same if I had the method 'address' (better name???) vs. method name?

    >>> upper_method = s.upper

How do I combine this upper_method with string s to execute the method and return 'LOWER'? Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Algorithm for sequencing a collection of dependent equations
Hi Tim, > I think that what you're looking for is a topological sort BINGO! That's *exactly* what I was searching for. Thank you very much, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Algorithm for sequencing a collection of dependent equations
We're working on a DSL (domain specific language) that we translate into a list of tokenized expressions. My challenge is to figure out how to sequence evaluation of these expressions so that we evaluate these expressions in the proper order given that expressions have dependencies on other expressions. We've done the work to determine the full list of tokens associated with each expression (after referencing other expressions) and we've detected expressions that result in loops. Here's an example of expressions and their full list of dependencies:

    a = b + b + b + c + c    > b, c, d, e, s, t, x
    b = c + d + e            > c, d, e, s, t, x
    c = s + 3                > s, x
    d = t + 1                > t
    e = t + 2                > t
    s = x + 100              > x
    t = 10                   > None
    x = 1                    > None
    y = 2                    > None

I'm looking for an algorithm/data structure that will allow me to start with the least dependent expression (t, x, y) and move through the list of expressions in dependency order ending with the expression with the most dependencies. I imagine that spreadsheets have to perform a similar type of analysis to figure out how to recalculate their cells. Suggestions on algorithms and/or data structures (some form of graph?) to support the above goals? Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
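The topological sort mentioned in the reply can be sketched with Kahn's algorithm, using only the direct dependencies of each expression from the example above. The function name evaluation_order is hypothetical, and this is one of several valid ways to implement the sort:

```python
from collections import deque

def evaluation_order(direct_deps):
    """Order names so that every name comes after everything it depends on."""
    remaining = {name: set(deps) for name, deps in direct_deps.items()}
    dependents = {name: [] for name in direct_deps}
    for name, deps in direct_deps.items():
        for dep in deps:
            dependents[dep].append(name)

    # Start with the expressions that depend on nothing.
    ready = deque(sorted(n for n, deps in remaining.items() if not deps))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for dependent in sorted(dependents[name]):
            remaining[dependent].discard(name)
            if not remaining[dependent]:
                ready.append(dependent)

    if len(order) != len(direct_deps):
        raise ValueError("dependency cycle detected")
    return order

deps = {
    "a": {"b", "c"}, "b": {"c", "d", "e"}, "c": {"s"},
    "d": {"t"}, "e": {"t"}, "s": {"x"},
    "t": set(), "x": set(), "y": set(),
}
order = evaluation_order(deps)
```

Evaluating the expressions in the returned order guarantees that every operand has been computed before it is needed.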
Re: Max size of Python source code and compiled equivalent
> Heh, great question, and I'm curious too! But one place to get a bit more > info is the standard library. > > rosuav@sikorsky:~/cpython/Lib$ find -name \*.py|xargs ls -lS|head > -rw-r--r-- 1 rosuav rosuav 624122 Jul 17 17:38 ./pydoc_data/topics.py Brilliant! :) Thanks Chris! Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Max size of Python source code and compiled equivalent
We're writing a DSL parser that generates Python code. While the size of our generated code will be small (< 32K), I wanted to re-assure the rest of our team that there are no reasonable code size boundaries that we need to be concerned about. I've searched for Python documentation that covers max Python source (*.py) and compiled file (*.pyc) sizes without success. Any tips on where to look for this information? Background: Python 3.5.1 on Linux. Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Technique for safely reloading dynamically generated module
Thank you Chris and Peter. The source file we're generating has one main function (go) with some supporting functions and classes as well. Will there be any problems referencing additional functions or classes defined in the source that gets passed to exec ... as long as references to those functions and classes happen within the generated module? I assume that one downside to the exec() approach is that there is no persistent namespace for this code's functions and classes, eg. after the exec() completes, its namespace disappears and is not available to code that follows? The Python documentation also warns: "Be aware that the return and yield statements may not be used outside of function definitions even within the context of code passed to the exec() function. The return value is None." Malcolm -- https://mail.python.org/mailman/listinfo/python-list
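On the namespace question: when exec() is given an explicit dict, the functions and classes defined by the generated source stay in that dict after exec() returns, and they can call each other because they share it as their globals. A minimal sketch with a made-up generated source:

```python
# Hypothetical generated source with a main function and a helper.
source = '''
def helper(x):
    return x * 2

def go():
    return helper(21)
'''

namespace = {}
# Compiling first gives tracebacks a recognizable pseudo-filename.
exec(compile(source, "<generated>", "exec"), namespace)

# The dict persists, so go() can be looked up and called afterwards.
result = namespace["go"]()
```

Regenerating the code then just means building a fresh dict and running exec() again, with no module reload involved.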
Technique for safely reloading dynamically generated module
We're designing a server application that parses a custom DSL (domain specific language) source file, generates a Python module with the associated logic, and runs the associated code. Since this is a server application, we need to reload the module after each regeneration. Is this process as simple as the following pseudo code or are there other issues we need to be aware of? Are there better techniques for this workflow (eval, compile, etc)? We're working in Python 3.5.1.

    import importlib

    # custom_code is the module our code will generate - a version of this
    # file will always be present
    # if custom_code.py is missing, a blank version of this file is created
    # before this step
    import custom_code

    while True:
        # (re)generates custom_code.py visible in sys.path
        generate_custom_code( source_file )

        # reload the module whose source we just generated
        importlib.reload( custom_code )

        # run the main code in generated module
        custom_code.go()

Thank you, Malcolm -- https://mail.python.org/mailman/listinfo/python-list
Re: Validating Entry in tkinter
Peter, > I think it doesn't matter whether you type in text, or insert it with Ctrl+V > or the middle mouse button. The validatecommand handler is always triggered. > I suspect achieving the same effect with Button/KeyPress handlers would > require significantly more work. Thank you! Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Re: IMAP4_SSL, libgmail, GMail and corporate firewall/proxy
Andrea, What type of result do you get trying port 993 ? Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Re: Yet Another MySQL Problem
Tim, > The underscore is a valid variable-name, idiomatically used for "I don't care > about this", often seen in places like tuple assignment: The underscore is also used as an alias for gettext.gettext or gettext.ugettext so you may want to use another variable-name. Malcolm -- http://mail.python.org/mailman/listinfo/python-list
What license/copyright text to include and where to include it when selling a commercial Python based application?
Looking for advice on what Python license and copyright text to include and where to include it when selling a commercial (Windows based) Python based application. By license text and copyrights I am referring to the text on this page: PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 http://www.python.org/download/releases/2.6.5/license/ By "where to include it" I mean: 1. What Python license text/copyright text should I place in our printed user manual? 2. What Python license text/copyright text should I include in our online documentation? 3. What Python license text/copyright text should I include in product's license text file? 4. What Python license text/copyright text should I include in application's splash screen and about dialog boxes? Thank you, Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Re: Strategy for determing difference between 2 very large dictionaries
Hi Gabriel, > in Python 3.0 keys() behaves as iterkeys() in previous versions, so the above > code is supposed to be written in Python 2.x) I understand. Thank you. > note that dict comprehensions require Python 3.0 I'm relieved to know that I didn't miss that feature in my reading of Python's 2.5/2.6 documentation :) > You might use instead: > > dict((key,(dict1[key],dict2[key])) for key in ...) Excellent. Thank you. Regards, Malcolm

- Original message - From: "Gabriel Genellina" To: python-list@python.org Date: Wed, 24 Dec 2008 07:10:16 -0200 Subject: Re: Strategy for determining difference between 2 very large dictionaries

On Wed, 24 Dec 2008 06:23:00 -0200, wrote: > Hi Gabriel, > > Thank you very much for your feedback! > >> k1 = set(dict1.iterkeys()) > > I noticed you suggested .iterkeys() vs. .keys(). Is there any advantage > to using an iterator vs. a list as the basis for creating a set? I You've got an excellent explanation from Marc Rintsch. (Note that in Python 3.0 keys() behaves as iterkeys() in previous versions, so the above code is supposed to be written in Python 2.x) >>> can this last step be done via a simple list comprehension? > >> Yes; but isn't a dict comprehension more adequate? >> >> {key: (dict1[key], dict2[key]) for key in common_keys if >> dict1[key]!=dict2[key]} > > Cool!! I'm relatively new to Python and totally missed the ability to > work with dictionary comprehensions. Yes, your dictionary comprehension > technique is much better than the list comprehension approach I was > struggling with. Your dictionary comprehension statement describes > exactly what I wanted to write. This time, note that dict comprehensions require Python 3.0 -- so the code above won't work in Python 2.x. (It's not a good idea to mix both versions in the same post, sorry!) You might use instead: dict((key,(dict1[key],dict2[key])) for key in ...) but it's not as readable.
-- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
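Pulling the pieces of this thread together for Python 3, where dict.keys() returns a set-like view, the comparison can be sketched as below. The function name diff_dicts and the sample data are made up for illustration:

```python
def diff_dicts(dict1, dict2):
    """Return keys only in dict1, keys only in dict2, and differing shared keys."""
    keys1, keys2 = dict1.keys(), dict2.keys()
    only_in_1 = keys1 - keys2
    only_in_2 = keys2 - keys1
    # For shared keys, keep the pair of values when they disagree.
    changed = {
        key: (dict1[key], dict2[key])
        for key in keys1 & keys2
        if dict1[key] != dict2[key]
    }
    return only_in_1, only_in_2, changed

log_a = {"evt1": 10, "evt2": 20, "evt3": 30}
log_b = {"evt2": 20, "evt3": 31, "evt4": 40}
only_a, only_b, changed = diff_dicts(log_a, log_b)
```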
Re: Strategy for determing difference between 2 very large dictionaries
Hi Roger, By very large dictionary, I mean about 25M items per dictionary. Each item is a simple integer whose value will never exceed 2^15. I populate these dictionaries by parsing very large ASCII text files containing detailed manufacturing events. From each line in my log file I construct one or more keys and increment the numeric values associated with these keys using timing info also extracted from each line. Some of our log files are generated by separate monitoring equipment measuring the same process. In theory, these log files should be identical, but of course they are not. I'm looking for a way to determine the differences between the 2 dictionaries I will create from so-called matching sets of log files. At this point in time, I don't have concerns about memory as I'm running my scripts on a dedicated 64-bit server with 32Gb of RAM (but with budget approval to raise our total RAM to 64Gb if necessary). My main concern is am I applying a reasonably pythonic approach to my problem, eg. am I using appropriate python techniques and data structures? I am also interested in using reasonable techniques that will provide me with the fastest execution time. Thank you for sharing your thoughts with me. Regards, Malcolm - Original message - From: "Roger Binns" To: python-list@python.org Date: Tue, 23 Dec 2008 23:26:49 -0800 Subject: Re: Strategy for determing difference between 2 very large dictionaries -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 pyt...@bdurham.com wrote: > Feedback on my proposed strategies (or better strategies) would be > greatly appreciated. Both strategies will work but I'd recommend the second approach since it uses already tested code written by other people - the chances of it being wrong are far lower than new code. You also neglected to mention what your concerns are or even what "very large" is. 
Example concerns are memory consumption, cpu consumption, testability, utility of output (eg as a generator getting each result on demand or a single list with complete results). Some people will think a few hundred entries is large. My idea of large is a working set larger than my workstation's 6GB of memory :-) In general the Pythonic approach is: 1 - Get the correct result 2 - Simple code (developer time is precious) 3 - Optimise for your data and environment Step 3 is usually not needed. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAklR5DUACgkQmOOfHg372QSuWACgp0xrdpW+NSB6qqCM3oBY2e/I LIEAn080VgNvmEYj47Mm7BtV69J1GwXN =MKLl -END PGP SIGNATURE- -- http://mail.python.org/mailman/listinfo/python-list
Re: Can you recommend a book?
My two favorites: - Core Python Programming by Chun - Learning Python by Lutz Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Recommended "from __future__ import" options for Python 2.5.2?
Is there any consensus on what "from __future__ import" options developers should be using in their Python 2.5.2 applications? Is there a consolidated list of "from __future__ import" options to choose from? Thank you, Malcolm -- http://mail.python.org/mailman/listinfo/python-list
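The consolidated list asked about here is available from the interpreter itself: the __future__ module exposes all_feature_names, and each feature object reports the release in which it becomes mandatory. A short sketch (the output depends on the Python version running it):

```python
import __future__

# Map each feature name to the release where it becomes (or became) mandatory.
features = {
    name: getattr(__future__, name).getMandatoryRelease()
    for name in __future__.all_feature_names
}

for name, release in sorted(features.items()):
    print(name, release)
```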
Parsing locale specific dates, currency, numbers
The locale module provides the ability to format dates, currency and numbers according to a specific locale. Is there a corresponding module for parsing locale's output to convert locale formatted dates, currency, and numbers back to their native data types on the basis of a specified locale? In other words, a module that will reverse the outputs of locale on a locale specific basis. Thanks! Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Tips for load balancing multiple Python apps on dual/quad core processors?
I'm looking for tips on how to load balance running multiple Python applications in multi-CPU environments. My understanding is that Python applications and their threads are limited to a specific CPU. Background: I have a Python utility that processes email messages. I suspect there's a lot of idle time while this utility waits on a remote email server. I would like to run as many simultaneous copies of this utility as possible without overwhelming the server these utilities are running on. My thought is that I should write a dispatcher that monitors CPU load and launches/cancels multiple instances of my utility with specific job queues to process. Is there a cross-platform way to monitor CPU load? Is there a pre-existing Python module I should be looking at for building (subclassing) my dispatcher? Thanks! Malcolm -- http://mail.python.org/mailman/listinfo/python-list
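Since the idle time described here is spent waiting on a remote server, a thread pool inside a single process may already give most of the benefit, because threads that are blocked on network I/O do not hold each other up. The sketch below uses the concurrent.futures module (Python 3.2 and later); fetch_message and its sleep are placeholders for the real mail handling:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_message(n):
    """Placeholder for one unit of work that mostly waits on a remote server."""
    time.sleep(0.1)  # simulated network wait
    return n * 2

# Eight waits overlap instead of running back to back.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch_message, range(8)))
```

For work that is CPU-bound rather than I/O-bound, ProcessPoolExecutor offers the same interface while spreading the load across cores.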
Re: Python for Palm OS
E-Lo, PalmPython http://c2.com/cgi/wiki?PalmPython I have no experience with Python for Palm OS's. This is just a reference from my personal notes. Malcolm -- http://mail.python.org/mailman/listinfo/python-list
wxPython/wxWidgets ok for production use ? (was Re: Quality assurance in Python projects containing C modules)
Stefan, > My personal experience with wxPython has its ups and downs. Specifically when > it comes to crashes, I wouldn't bet my life on it. (but then, the OP I'm new to Python and getting ready to build a small client based application intended to run on Windows and Linux. I was planning on using wxPython until I saw your comment above. Any suggestions on an alternative Python client-side GUI library (pyQT ?) or tips on where I can find out more about wxPython/wxWidget problems? Thank you, Malcolm -- http://mail.python.org/mailman/listinfo/python-list