Re: [Tutor] When is and isn't "__file__" set?
On Wed, Jan 10, 2018 at 08:02:24PM -0600, boB Stepp wrote:

> I am still puzzling over things from the thread, "Why does
> os.path.realpath('test_main.py') give different results for unittest
> than for testing statement in interpreter?" The basic question I am
> trying to answer is how to determine the path to a particular module
> that is being run. For the experiments I have run thus far, the
> module attribute, "__file__", has so far reliably given me the
> absolute path to the module being run. But the documentation suggests
> that this attribute is optional. So what can I rely on here with
> "__file__"? The first sentence of the cited quote is not illuminating
> this sufficiently for me.

Modules which are loaded from a .py or .pyc file on disk should always
have __file__ set. If they don't, that's a bug in the interpreter.

Modules which are loaded from a .dll or .so binary file also should have
__file__ set.

Modules that you create on the fly like this:

py> from types import ModuleType
py> module = ModuleType('module')
py> module.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'module' has no attribute '__file__'

will not have __file__ set unless you manually set it yourself. Such
hand-made modules can be stored in databases and retrieved later, in
which case they still won't have a __file__ attribute.

Module objects which are built into the interpreter itself, like sys,
also won't have a __file__ attribute:

py> sys.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'sys' has no attribute '__file__'

One tricky, but unusual case, is that Python supports importing and
running modules loaded from zip files. Very few people know this feature
even exists -- it is one of Python's best kept secrets -- and even fewer
know how it works. I'm not sure what happens when you load a module from
a zip file, whether it will have a __file__ or not.

Basically, if you stick to reading module.__file__ for modules which
come from a .py file, you should be absolutely fine. But to practice
defensive programming, something like:

    try:
        path = module.__file__
    except AttributeError:
        print("handle the case where the module doesn't exist on disk")
    else:
        print('handle the case where the module does exist on disk')

might be appropriate.

--
Steve
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
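The defensive check described in this reply can be wrapped in a small helper. This is only a minimal sketch: `module_path` is a hypothetical name made up for illustration, not a standard library function, and it assumes that a missing `__file__` (or one set to None) should simply be reported as "no file on disk".

```python
import json
import os
import sys
from types import ModuleType


def module_path(module):
    """Return the absolute path a module was loaded from, or None.

    Hypothetical helper for this sketch; not part of the standard library.
    """
    # getattr with a default covers built-in and hand-made modules,
    # which have no __file__ attribute at all.
    path = getattr(module, '__file__', None)
    if path is None:
        return None
    return os.path.abspath(path)


print(module_path(json))                    # a path inside the standard library
print(module_path(sys))                     # None on CPython, where sys is built in
print(module_path(ModuleType('handmade')))  # None for a module made on the fly
```

Using getattr() with a default keeps the calling code free of try/except blocks; the try/except/else form shown in the reply is equivalent when you want separate code paths for the two cases.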
[Tutor] When is and isn't "__file__" set?
I am actually interested in the answer to this question for Python
versions 2.4, 2.6 and 3.x.

At https://docs.python.org/3/reference/import.html?highlight=__file__#__file__
it says:

    __file__ is optional. If set, this attribute’s value must be a
    string. The import system may opt to leave __file__ unset if it has
    no semantic meaning (e.g. a module loaded from a database).

    If __file__ is set, it may also be appropriate to set the __cached__
    attribute which is the path to any compiled version of the code
    (e.g. byte-compiled file). The file does not need to exist to set
    this attribute; the path can simply point to where the compiled file
    would exist (see PEP 3147).

    It is also appropriate to set __cached__ when __file__ is not set.
    However, that scenario is quite atypical. Ultimately, the loader is
    what makes use of __file__ and/or __cached__. So if a loader can
    load from a cached module but otherwise does not load from a file,
    that atypical scenario may be appropriate.

I am still puzzling over things from the thread, "Why does
os.path.realpath('test_main.py') give different results for unittest
than for testing statement in interpreter?" The basic question I am
trying to answer is how to determine the path to a particular module
that is being run. For the experiments I have run thus far, the module
attribute, "__file__", has so far reliably given me the absolute path to
the module being run. But the documentation suggests that this attribute
is optional. So what can I rely on here with "__file__"? The first
sentence of the cited quote is not illuminating this sufficiently for
me.

--
boB
Re: [Tutor] delete strings from specificed words
On 09Jan2018 22:20, YU Bo wrote:
> The text i will working as follow:
>
> ```text
> [...]
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index a789f952b3e9..443892dabedb 100644
> [...]
> +++ b/tools/perf/util/util.c
> [...]
> ```
>
> In fact, this is a patch from lkml,my goal is to design a kernel podcast
> for myself to focus on what happened in kernel. I have crawled the text
> with python and want to remove strings from *diff --git*, because reading
> the git commit above, i have a shape in head.
>
> I have tried split(), replace(), but i have no idea to deal with it.

Do you have the text as above - a single string - or coming from a file?
I'll presume a single string.

I would treat the text as lines, particularly since the diff markers etc
are all line oriented.

So you might write something like this:

    interesting = []
    for line in the_text.splitlines():
        if line.startswith('diff --git '):
            break
        interesting.append(line)

Now the "interesting" list has the lines you want.

There's any number of variations on that you might use, but that should
get you going.

Cheers,
Cameron Simpson
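The line-oriented approach from this reply can be packaged as a function. A minimal sketch, assuming the whole message is already in one string; the name `text_before_diff` and the `marker` parameter are illustrative choices, not something from the thread.

```python
def text_before_diff(text, marker='diff --git '):
    """Return the part of text that comes before the first diff header line."""
    kept = []
    for line in text.splitlines():
        # Stop at the first line that starts a git diff section.
        if line.startswith(marker):
            break
        kept.append(line)
    # Re-join the kept lines and drop trailing blank lines.
    return '\n'.join(kept).rstrip()


sample = (
    "The registers rax, rcx and rdx are touched when controlling IBRS\n"
    "so they need to be saved when they can't be clobbered.\n"
    "\n"
    "diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h\n"
    "index 45a63e0..3b9b238 100644\n"
)
print(text_before_diff(sample))
```

Because the check uses startswith() on each line, a commit message that merely mentions "diff --git" in the middle of a sentence is not cut off, which is one reason to prefer the line-based loop over a plain substring search.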
Re: [Tutor] Why does os.path.realpath('test_main.py') give different results for unittest than for testing statement in interpreter?
On 10/01/18 20:20, eryk sun wrote:

> ... And working with COM via ctypes is also complex, which is why
> comtypes exists.

Or easier still Pythonwin (aka PyWin32).
I far prefer pythonwin over ctypes for any kind of COM work.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
Re: [Tutor] Why does os.path.realpath('test_main.py') give different results for unittest than for testing statement in interpreter?
On Wed, Jan 10, 2018 at 12:59 PM, Albert-Jan Roskam wrote:
>
> I tried:
> >>> from os.path import _getfullpathname
> >>> _getfullpathname(r"H:")
> 'h:\\path\\to\\folder'
> >>> import os
> >>> os.getcwd()
> 'h:\\path\\to\\folder'
>
> I expected h:\ to be \\server\share\foo.

You called _getfullpathname (WinAPI GetFullPathName), not
_getfinalpathname (WinAPI GetFinalPathNameByHandle). GetFullPathName
works on the path as a string without touching the filesystem.
GetFinalPathNameByHandle reconstructs a path when given a handle to a
file or directory.

> The fact that the current working directory was returned was even more
> unexpected.

"H:" or "H:relative/path" is relative to the working directory on
drive H:. The process only has one working directory, but
GetFullPathName also checks for environment variables such as "=H:".
The C runtime's _chdir function sets these magic variables, as does
Python's os.chdir function (we don't call C _chdir). WinAPI
SetCurrentDirectory does not set them. For example:

    >>> os.chdir('Z:/Temp')
    >>> win32api.GetEnvironmentVariable('=Z:')
    'Z:\\Temp'
    >>> os.path._getfullpathname('Z:relative')
    'Z:\\Temp\\relative'

> Why would anybody *want* the drive letters? They are only useful because (a)
> they save on keystrokes (b) they bypass the annoying limitation of the cd
> command on windows, ie. it does not work with UNC paths.

Windows itself has no problem using a UNC path as the working
directory. That's a limit of the CMD shell.

SUBST drives can be used to access long paths. Since the substitution
occurs in the kernel, it avoids the MAX_PATH 260-character limit. Of
course, you can also use junctions and symlinks to access long paths,
or in Windows 10 simply enable long-path support.

> I know net use, pushd, subst. I use 'net use' for more or less permanent
> drives and pushd/popd to get a temporary drive, available letter (cd
> nuisance).

`net.exe use` and CMD's PUSHD command (with a UNC path) both call
WinAPI WNetAddConnection2 to create a mapped network drive. The
difference is that net.exe can supply alternate credentials and create
a persistent mapping, while PUSHD uses the current user's credentials
and creates a non-persistent mapping.

If your account gets logged on with a UAC split token, the standard
and elevated tokens actually have separate logon sessions with
separate local-device mappings. You can enable a policy to link the
two logon sessions. Set a DWORD value of 1 named
"EnableLinkedConnections" in the key
"HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System", and
reboot.

subst.exe creates substitute paths using WinAPI DefineDosDevice.
Unlike the WNet API, this function doesn't use MPR (multiple provider
router) to create a direct link for the network provider (e.g.
\Device\LanmanRedirector); doesn't create a linked connection when
EnableLinkedConnections is defined; and can't create a persistent
drive with stored credentials (though you can use an account logon
script for this). On the plus side, a drive mapped via subst.exe can
target any path.

> Interesting code! I have used the following, which uses SHGetFolderPath, ie.
> without 'Known'.
> from win32com.shell import shell, shellcon
> desktop = shell.SHGetFolderPath(0, shellcon.CSIDL_DESKTOP, 0, 0)

SHGetFolderPath is usually fine, but still, it's outdated and
deprecated. win32com.shell doesn't wrap SHGetKnownFolderPath for some
reason, but you can still use the new known-folder API without ctypes.
Just create a KnownFolderManager instance. For example:

    import pythoncom
    from win32com.shell import shell

    kfmgr = pythoncom.CoCreateInstance(shell.CLSID_KnownFolderManager, None,
                pythoncom.CLSCTX_INPROC_SERVER, shell.IID_IKnownFolderManager)

    desktop_path = kfmgr.GetFolder(shell.FOLDERID_Desktop).GetPath()

This doesn't work as conveniently for getting known folders of other
users. While the high-level SHGetKnownFolderPath function takes care
of loading the user profile and impersonating, we have to do this
ourselves when using a KnownFolderManager instance.

That said, to correct my previous post, you have to be logged on with
SeTcbPrivilege access (e.g. a SYSTEM service) to get and set other
users' known folders without their password. (If you have the password
you can use a regular logon instead of an S4U logon, and that works
fine.)

> Working with ctypes.wintypes is quite complex!

I wouldn't say ctypes is complex in general. But calling LsaLogonUser
is complex due to all of the structs that include variable-sized
buffers. And working with COM via ctypes is also complex, which is why
comtypes exists.
Re: [Tutor] question about metaclasses
Albert-Jan Roskam wrote:

> Why does following the line (in #3)

> # 3-
> class Meta(type):
>     def __new__(cls, name, bases, attrs):
>         for attr, obj in attrs.items():
>             if attr.startswith('_'):
>                 continue
>             elif not isinstance(obj, property):
>                 import pdb;pdb.set_trace()
>                 #setattr(cls, attr, property(lambda self: obj))  # incorrect!
>                 raise ValueError("Only properties allowed")
>         return super().__new__(cls, name, bases, attrs)
>
> class MyReadOnlyConst(metaclass=Meta):
>     __metaclass__ = Meta
>     YES = property(lambda self: 1)
>     NO = property(lambda self: 0)
>     DUNNO = property(lambda self: 42)
>     THROWS_ERROR = 666
>
>
> c2 = MyReadOnlyConst()
> print(c2.THROWS_ERROR)
> #c2.THROWS_ERROR = 777
> #print(c2.THROWS_ERROR)

> not convert the normal attribute into a property?
>
> setattr(cls, attr, property(lambda self: obj))  # incorrect!

cls is Meta itself, not MyReadOnlyConst (which is an instance of Meta).
When the code in Meta.__new__() executes MyReadOnlyConst does not yet
exist, but future attributes are already there, in the form of the attrs
dict. Thus to convert the integer value into a read-only property you can
manipulate that dict (or the return value of super().__new__()):

class Meta(type):
    def __new__(cls, name, bases, attrs):
        for attr, obj in attrs.items():
            if attr.startswith('_'):
                continue
            elif not isinstance(obj, property):
                attrs[attr] = property(lambda self, obj=obj: obj)

        return super().__new__(cls, name, bases, attrs)

class MyReadOnlyConst(metaclass=Meta):
    YES = property(lambda self: 1)
    NO = property(lambda self: 0)
    DUNNO = property(lambda self: 42)
    THROWS_ERROR = 666

c = MyReadOnlyConst()
try:
    c.THROWS_ERROR = 42
except AttributeError:
    pass
else:
    assert False
assert c.THROWS_ERROR == 666

PS: If you don't remember why the obj=obj is necessary:
Python uses late binding; without that trick all lambda functions would
return the value bound to the obj name when the for loop has completed.
A simplified example:

>>> fs = [lambda: x for x in "abc"]
>>> fs[0](), fs[1](), fs[2]()
('c', 'c', 'c')
>>> fs = [lambda x=x: x for x in "abc"]
>>> fs[0](), fs[1](), fs[2]()
('a', 'b', 'c')
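The original question also asked whether a class decorator could do the same job. A minimal sketch of that alternative follows; `read_only_constants` and `Answers` are hypothetical names invented for this example. Unlike the metaclass, the decorator runs after the class exists, so here setattr() on the class is the right tool.

```python
def read_only_constants(cls):
    """Class decorator: turn plain public class attributes into read-only properties."""
    # Iterate over a snapshot, because setattr() changes the class namespace.
    for name, value in list(vars(cls).items()):
        if name.startswith('_') or isinstance(value, property):
            continue
        # value=value binds the current value and avoids the late-binding trap.
        setattr(cls, name, property(lambda self, value=value: value))
    return cls


@read_only_constants
class Answers:
    YES = 1
    NO = 0
    DUNNO = 42


a = Answers()
print(a.YES, a.NO, a.DUNNO)
try:
    a.YES = 0
except AttributeError:
    print("instances cannot rebind YES")
```

Note that this only protects instances: rebinding `Answers.YES` on the class itself still succeeds, just as it would with the metaclass version.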
Re: [Tutor] question about metaclasses
On Wed, Jan 10, 2018 at 10:08 AM, Albert-Jan Roskam wrote:
> Hi,
>
>
> In another thread on this list I was reminded of types.SimpleNamespace. This
> is nice, but I wanted to create a bag class with constants that are
> read-only. My main question is about example #3 below (example #2 just
> illustrates my thought process). Is this a use case to a metaclass? Or can I
> do it some other way (maybe a class decorator?). I would like to create a
> metaclass that converts any non-special attributes (=not starting with '_')
> into properties, if needed. That way I can specify my bag class in a very
> clean way: I only specify the metaclass, and I list the attributes as normal
> attributes, because the metaclass will convert them into properties.

You appear to be reimplementing Enum.

> Why does following the line (in #3) not convert the normal attribute into a
> property?
>
> setattr(cls, attr, property(lambda self: obj))  # incorrect!

Because `cls` is `Meta`, not `MyReadOnlyConst`; `__new__` is
implicitly a classmethod and `MyReadOnlyConst` doesn't actually exist
yet. When `MyReadOnlyConst` is created by `type.__new__` it will be
filled with the contents of `attrs`, so instead of `setattr` you want
`attrs[attr] = property(...)`.

But once you're past the learning exercise that this is, just use
enum.Enum or collections.namedtuple :)

--
Zach
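For comparison, here is roughly what the enum.Enum suggestion looks like for the constants in the question. This is a minimal sketch; the class name `Const` is chosen only for illustration.

```python
from enum import Enum


class Const(Enum):
    YES = 1
    NO = 0
    DUNNO = 42


# Members are looked up on the class; .value gives the underlying constant.
print(Const.YES, Const.YES.value)

try:
    Const.YES = 0          # Enum refuses to rebind an existing member
except AttributeError as exc:
    print("read-only:", exc)

# Lookup by value returns the same member object.
print(Const(42) is Const.DUNNO)
```

One difference from the property-based bag: members are accessed on the class rather than on an instance, and `Const.YES` is a member object, so code that needs the plain integer uses `.value` (or enum.IntEnum, if integer behaviour is wanted).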
Re: [Tutor] question about metaclasses
On Wed, Jan 10, 2018 at 04:08:04PM +, Albert-Jan Roskam wrote:

> In another thread on this list I was reminded of
> types.SimpleNamespace. This is nice, but I wanted to create a bag
> class with constants that are read-only.

If you expect to specify the names of the constants ahead of time, the
best solution is (I think) a namedtuple.

from collections import namedtuple
Bag = namedtuple('Bag', 'yes no dunno')
a = Bag(yes=1, no=0, dunno=42)
b = Bag(yes='okay', no='no way', dunno='not a clue')

ought to do what you want.

Don't make the mistake of doing this:

from collections import namedtuple
a = namedtuple('Bag', 'yes no dunno')(yes=1, no=0, dunno=42)
b = namedtuple('Bag', 'yes no dunno')(yes='okay', no='no way', dunno='not a clue')

because that's quite wasteful of memory: each of a and b belongs to a
separate hidden class, and classes are rather largish objects.

If you expect to be able to add new items on the fly, but have them
read-only once set, that's a different story.

--
Steve
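The "different story" case, where names can be added on the fly but become read-only once set, is not worked out in this reply. One possible approach is sketched below; the `WriteOnce` class is a hypothetical illustration, not something proposed in the thread, and it assumes that overriding __setattr__ and __delattr__ gives enough protection for the use case.

```python
class WriteOnce:
    """Hypothetical bag: attributes may be added, but not rebound or deleted."""

    def __setattr__(self, name, value):
        # The instance __dict__ holds everything set so far.
        if name in self.__dict__:
            raise AttributeError("%r is already set and is read-only" % name)
        super().__setattr__(name, value)

    def __delattr__(self, name):
        raise AttributeError("%r cannot be deleted" % name)


bag = WriteOnce()
bag.yes = 1
bag.no = 0
try:
    bag.yes = 99
except AttributeError as exc:
    print(exc)
print(bag.yes, bag.no)
```

This guards against accidental rebinding only; code that writes to `bag.__dict__` directly can still change the values.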
[Tutor] question about metaclasses
Hi,

In another thread on this list I was reminded of types.SimpleNamespace.
This is nice, but I wanted to create a bag class with constants that are
read-only. My main question is about example #3 below (example #2 just
illustrates my thought process). Is this a use case for a metaclass? Or
can I do it some other way (maybe a class decorator?). I would like to
create a metaclass that converts any non-special attributes (=not
starting with '_') into properties, if needed. That way I can specify my
bag class in a very clean way: I only specify the metaclass, and I list
the attributes as normal attributes, because the metaclass will convert
them into properties.

Why does the following line (in #3) not convert the normal attribute
into a property?

setattr(cls, attr, property(lambda self: obj))  # incorrect!

# 1-
# nice, but I want the constants to be read-only
from types import SimpleNamespace
const = SimpleNamespace(YES=1, NO=0, DUNNO=9)
const.YES = 0
print(const)

# 2-
# works, but I wonder if there's a builtin way
class Const(object):
    """Adding attributes is ok, modifying them is not"""
    YES = property(lambda self: 1)
    NO = property(lambda self: 0)
    DUNNO = property(lambda self: 42)
    #THROWS_ERROR = 666

    def __new__(cls):
        for attr in dir(cls):
            if attr.startswith('_'):
                continue
            elif not isinstance(getattr(cls, attr), property):
                raise ValueError("Only properties allowed")
        return super().__new__(cls)
    def __repr__(self):
        kv = ["%s=%s" % (attr, getattr(self, attr)) for \
              attr in sorted(dir(Const)) if not attr.startswith('_')]
        return "ReadOnlyNamespace(" + ", ".join(kv) + ")"

c = Const()
print(repr(c))
#c.YES = 42  # raises AttributeError (desired behavior)

print(c.YES)

# 3-
class Meta(type):
    def __new__(cls, name, bases, attrs):
        for attr, obj in attrs.items():
            if attr.startswith('_'):
                continue
            elif not isinstance(obj, property):
                import pdb;pdb.set_trace()
                #setattr(cls, attr, property(lambda self: obj))  # incorrect!
                raise ValueError("Only properties allowed")
        return super().__new__(cls, name, bases, attrs)

class MyReadOnlyConst(metaclass=Meta):
    __metaclass__ = Meta
    YES = property(lambda self: 1)
    NO = property(lambda self: 0)
    DUNNO = property(lambda self: 42)
    THROWS_ERROR = 666


c2 = MyReadOnlyConst()
print(c2.THROWS_ERROR)
#c2.THROWS_ERROR = 777
#print(c2.THROWS_ERROR)

Thank you in advance and sorry about the large amount of code!

Albert-Jan
Re: [Tutor] Why does os.path.realpath('test_main.py') give different results for unittest than for testing statement in interpreter?
From: eryk sun
Sent: Wednesday, January 10, 2018 3:56 AM
To: tutor@python.org
Cc: Albert-Jan Roskam
Subject: Re: [Tutor] Why does os.path.realpath('test_main.py') give different results for unittest than for testing statement in interpreter?

On Tue, Jan 9, 2018 at 2:48 PM, Albert-Jan Roskam wrote:
>
>> I think that it would be a great enhancement if os.realpath would return the
>> UNC path if given a mapped drive in Windows, if needed as extended path
>> (prefixed with "\\?\UNC\").
>> That's something I use all the time, unlike symlinks, in Windows.
>
>pathlib can do this for you, or call os.path._getfinalpathname.

I tried:
>>> from os.path import _getfullpathname
>>> _getfullpathname(r"H:")
'h:\\path\\to\\folder'
>>> import os
>>> os.getcwd()
'h:\\path\\to\\folder'

I expected h:\ to be \\server\share\foo. The fact that the current
working directory was returned was even more unexpected.

>>I recently helped someone that wanted the reverse, to map the resolved
>>UNC path back to a logical drive:

Oh dear. Why would anybody *want* the drive letters? They are only
useful because (a) they save on keystrokes (b) they bypass the annoying
limitation of the cd command on windows, ie. it does not work with UNC
paths. Driveletter-->UNC conversion is useful when e.g. logging file
paths. I do wonder whether the method used to assign the drive letter
matters with the . I know net use, pushd, subst. I use 'net use' for more
or less permanent drives and pushd/popd to get a temporary drive,
available letter (cd nuisance).

>We can't assume in general that a user's special folders (e.g.
>Desktop, Documents) are in the default location relative to the
>profile directory. Almost all of them are relocatable. There are shell
>APIs to look up the current locations, such as SHGetKnownFolderPath.
>This can be called with ctypes [1]. For users other than the current
>user, it requires logging on and impersonating the user, which
>requires administrator access.
>
>[1]: https://stackoverflow.com/a/33181421/205580

Interesting code! I have used the following, which uses SHGetFolderPath,
ie. without 'Known'.

from win32com.shell import shell, shellcon
desktop = shell.SHGetFolderPath(0, shellcon.CSIDL_DESKTOP, 0, 0)

Working with ctypes.wintypes is quite complex!
Re: [Tutor] delete strings from specificed words
Hi,

On Wed, Jan 10, 2018 at 10:37:09AM +0100, Peter Otten wrote:
>YU Bo wrote:
>
>> index 45a63e0..3b9b238 100644
>> ...
>> I want to delete string from *diff --git* to end, because too many code is
>> here
>
>Use str.split() or str.partition() and only keep the first part:
>
>>>> text = """The registers rax, rcx and rdx are touched when controlling IBRS
>... so they need to be saved when they can't be clobbered.
>...
>... diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
>... index 45a63e0..3b9b238 100644
>... """
>>>> cleaned_text = text.partition("diff --git")[0].strip()
>>>> print(cleaned_text)
>The registers rax, rcx and rdx are touched when controlling IBRS
>so they need to be saved when they can't be clobbered.

Cool, it is what I want.

Thanks all!

Bo
Re: [Tutor] delete strings from specificed words
YU Bo wrote:

> ```text
> The registers rax, rcx and rdx are touched when controlling IBRS
> so they need to be saved when they can't be clobbered.
>
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index 45a63e0..3b9b238 100644
> ...
> ```
> I want to delete string from *diff --git* to end, because too many code is
> here

Use str.split() or str.partition() and only keep the first part:

>>> text = """The registers rax, rcx and rdx are touched when controlling IBRS
... so they need to be saved when they can't be clobbered.
...
... diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
... index 45a63e0..3b9b238 100644
... """
>>> cleaned_text = text.partition("diff --git")[0].strip()
>>> print(cleaned_text)
The registers rax, rcx and rdx are touched when controlling IBRS
so they need to be saved when they can't be clobbered.
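A short illustration of how str.partition() and str.split() behave when the marker is present and when it is missing may help here. The sample strings below are made up for the sketch.

```python
marker = "diff --git"

with_diff = "commit message\n\ndiff --git a/x b/x\nindex 1..2\n"
without_diff = "commit message only\n"

# partition() always returns a 3-tuple: (before, separator, after).
before, sep, after = with_diff.partition(marker)
print(repr(before.strip()), repr(sep))

# If the marker is absent, the whole string lands in the first slot
# and the other two are empty, so [0] is still safe to use.
before, sep, after = without_diff.partition(marker)
print(repr(before.strip()), repr(sep), repr(after))

# split() with maxsplit=1 gives the same first part.
print(repr(with_diff.split(marker, 1)[0].strip()))
```

Either call works for this task; partition() has the small advantage that the empty separator tells you whether a diff was found at all.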
Re: [Tutor] delete strings from specificed words
Hi,
First, thank you very much for your reply.

On Tue, Jan 09, 2018 at 10:25:11PM +, Alan Gauld via Tutor wrote:
>On 09/01/18 14:20, YU Bo wrote:
>
>> But, i am facing an interesting question.I have no idea to deal with it.
>
>I don't think you have given us enough context to
>be able to help much. We would need some idea of the
>input and output data (both current and desired)
>
>It sounds like you are building some kind of pretty printer.
>Maybe you could use Python's pretty-print module as a design
>template? Or maybe even use some of it directly. It just
>depends on your data formats etc.

Yes. I think python can deal with it directly.

>> In fact, this is a patch from lkml,my goal is to design a kernel podcast
>> for myself to focus on what happened in kernel.
>
>Sorry, I've no idea what lkml is nor what kernel you are talking about.
>
>Can you show us what you are receiving, what you are
>currently producing and what you are trying to produce?
>
>Some actual code might be an idea too.
>And the python version and OS.

Sorry, I didn't explain it. But my code is terrible.

lkml.py:

```code
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# File Name: lkml.py
# Author: Bo Yu

""" This is source code in page that i want to get

"""
import sys
reload(sys)
sys.setdefaultencoding('utf8')

import urllib2
from bs4 import BeautifulSoup
import requests

import chardet
import re

# import myself print function

from get_content import print_content

if __name__ == '__main__':
    comment_url = []
    target = 'https://www.spinics.net/lists/kernel/threads.html'
    req = requests.get(url=target)
    req.encoding = 'utf-8'
    content = req.text
    bf = BeautifulSoup(content ,'lxml') # There is no problem

    context = bf.find_all('strong')
    for ret in context[0:1]:
        for test in ret:
            print '\t'
            x = re.split(' ', str(test))
            y = re.search('"(.+?)"', str(x)).group(1)
            comment_url.append(target.replace("threads.html", str(y)))

    for tmp_target in comment_url:
        print "===This is a new file ==="
        print_content(tmp_target, 'utf-8', 'title')
```

get_content.py:

```code
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# File Name: get_content.py

import urllib2
from bs4 import BeautifulSoup
import requests

import chardet
import re

def print_content(url, charset, find_id):
    req = requests.get(url=url)
    req.encoding = charset
    content = req.text
    bf = BeautifulSoup(content ,'lxml')
    article_title = bf.find('h1')
    #author = bf.find_all('li')
    commit = bf.find('pre')
    print '\t'
    print article_title.get_text()
    print '\t'
    x = str(commit.get_text())
    print x
```

python --version: Python 2.7.13
OS: debian 9
usage: python lkml.py
output: oh... https://pastecode.xyz/view/04645424

Please ignore my print debug format.

This is my code and I can get text like the output above.
So, simply, my question: I don't know how to delete the strings after a
special word, for example:

```text
The registers rax, rcx and rdx are touched when controlling IBRS
so they need to be saved when they can't be clobbered.

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 45a63e0..3b9b238 100644
...
```

I want to delete the string from *diff --git* to the end, because there
is too much code here.

Whatever, thanks!

>--
>Alan G
>Author of the Learn to Program web site
>http://www.alan-g.me.uk/
>http://www.amazon.com/author/alan_gauld
>Follow my photo-blog on Flickr at:
>http://www.flickr.com/photos/alangauldphotos