Another possibility would be to use a lambda function or a callable object. This adds some call overhead, but it also lets you inject new parameters into the function call, and it does not require any extra import.
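The callable-object variant is not spelled out in the lambda example that follows, so here is a minimal sketch of it. The names ScaledGet and factor are hypothetical and only serve to illustrate injecting an extra parameter when the method is replaced.

```python
class C:
    def __init__(self):
        self.value = 21

    def get(self):
        return self.value


class ScaledGet:
    """Hypothetical callable object that stands in for C.get on one instance."""

    def __init__(self, instance, factor):
        self.instance = instance  # the object whose method is being replaced
        self.factor = factor      # extra parameter injected at patch time

    def __call__(self, *args, **kwargs):
        # Attributes stored on an instance are not bound automatically,
        # so the instance is kept explicitly instead of arriving as `self`.
        return self.instance.value * self.factor


obj = C()
before = obj.get()
obj.get = ScaledGet(obj, factor=2)  # shadows C.get on this instance only
after = obj.get()
other = C().get()                   # other instances still use C.get
print(before, after, other)
```

Compared with a lambda, the callable object keeps its injected state in named attributes, which can be easier to inspect while debugging.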
    obj.old_method_name = lambda *a, **kw: new_method_name(obj, *a, **kw)

A full example goes like this:

    class C:

        def __init__(self):
            self.value = 21

        def get(self):
            return self.value

    def new_get(self):
        return self.value * 2

    obj = C()
    print(obj.get())
    obj.get = lambda *a, **kw: new_get(obj, *a, **kw)
    print(obj.get())

This would first output 21 and then 42.

--

What you are trying to do requires more than just replacing the function _convert_cell.

By default, OpenpyxlReader loads the workbook in read_only mode, discarding all links. This means that the cell object present in _convert_cell has no hyperlink attribute. There is no option to make it load the links. To force them to be loaded, we need to replace load_workbook as well. This method asks openpyxl to load the workbook and decides whether the links are discarded or not.

The second problem is that as soon as you instantiate an ExcelFile object, it instantiates an OpenpyxlReader and loads the file, leaving you no time to replace the functions. Fortunately, ExcelFile gets the engine class from a class-level dictionary called _engines. This means that we can extend OpenpyxlReader, override those two methods and replace the reference in ExcelFile._engines.

The full source is:

    import pandas as pd

    class MyOpenpyxlReader(pd.ExcelFile.OpenpyxlReader):

        def load_workbook(self, filepath_or_buffer):
            from openpyxl import load_workbook
            # read_only=False yields full cell objects that carry
            # the hyperlink attribute used in _convert_cell below
            return load_workbook(
                filepath_or_buffer,
                read_only=False,
                data_only=False,
                keep_links=True
            )

        def _convert_cell(self, cell, convert_float: bool):
            value = super()._convert_cell(cell, convert_float)
            if cell.hyperlink is None:
                return value
            else:
                # linked cells become (text, link target) tuples
                return (value, cell.hyperlink.target)

    # make pandas use the subclass for the "openpyxl" engine
    pd.ExcelFile._engines["openpyxl"] = MyOpenpyxlReader

    df = pd.read_excel("links.xlsx")
    print(df)

The source above worked on python 3.8.10, pandas 1.5.0, and openpyxl 3.0.10.

The output for a sample xlsx file with the columns id, a page name (with links), and the last access is shown next.
The first element in the second column's output tuple is the cell's text and the second element is the cell's link:

       id                                         page  last access
    0   1            (google, https://www.google.com/)   2022-04-12
    1   2                  (gmail, https://gmail.com/)   2022-02-06
    2   3          (maps, https://www.google.com/maps)   2022-02-17
    3   4                    (bbc, https://bbc.co.uk/)   2022-08-30
    4   5            (reddit, https://www.reddit.com/)   2022-12-02
    5   6  (stackoverflow, https://stackoverflow.com/)   2022-05-25

--

Should you do any of this?

No.

1. What makes a good developer is the ability to write clear and maintainable code. None of these options is clear; they all increase cognitive complexity and reduce reliability.

2. We are manipulating internal class attributes and internal methods (those starting with _). Internal elements are not guaranteed to stay there across versions, even minor updates. You should not manipulate them unless you are working against a fixed library version, for example when writing tests that check whether the internal state has changed, hacking on the library, or debugging. Python assumes you will access these attributes wisely.

3. If you are working with other developers and you commit this code, there is a good chance that another developer is using a slightly different pandas version that lacks one of these elements. You will break the build, and your team will complain and start thinking you are a naive developer.

4. Even if you adapt your code for multiple pandas versions, you will end up with multiple ifs and different implementations. You don't want to maintain this over time.

5. It clearly takes more time to understand pandas' internals than to write your own reader using openpyxl. Doing so is not cumbersome, and if it changes the execution time from 20ms to 40ms but is much more reliable and maintainable, we surely prefer the latter.

The only scenario I see in which this would be acceptable is when you or your boss have an important presentation in the next hour, and you need a quick fix to make it work in order to demonstrate it.
After the presentation is over and people have validated the functionality, you should implement it properly.

Keep It Simple and Stupid (KISS)

--
Diego Souza
Wespa Intelligent Systems
Rio de Janeiro - Brasil

On Mon, Sep 19, 2022 at 1:00 PM <python-list-requ...@python.org> wrote:
>
> From: "Weatherby,Gerard" <gweathe...@uchc.edu>
> Date: Mon, 19 Sep 2022 13:06:42 +0000
> Subject: Re: How to replace an instance method?
>
> Just subclass and override whatever method you wish to modify
>
> “Private” is conceptual. Mostly it means when the next version of a module
> comes out, code that you wrote that accesses *._ parts of the module might
> break.
> ___
>
> import pandas
>
>
> class MyClass(pandas.ExcelFile.OpenpyxlReader):
>
>     def _convert_cell(self, cell, convert_float: bool) -> 'Scalar':
>         """override"""
>         # do whatever you want, or call the base class version
>         return super()._convert_cell(cell, convert_float)
>
> --
> Gerard Weatherby | Application Architect NMRbox | NAN | Department of
> Molecular Biology and Biophysics
> UConn Health 263 Farmington Avenue, Farmington, CT 06030-6406 uchc.edu
>
> On Sep 17, 2022, 5:29 PM -0400, Ralf M. <ral...@t-online.de>, wrote:
>
> On 17.09.2022 at 00:35, Dan Stromberg wrote:
>
> On Fri, Sep 16, 2022 at 2:06 PM Ralf M. <ral...@t-online.de <mailto:ral...@t-online.de>> wrote:
>
> I would like to replace a method of an instance, but don't know how to
> do it properly.
>
> You appear to have a good answer, but... are you sure this is a good idea?
>
> It's definitely a dirty hack.
>
> It'll probably be confusing to future maintainers of this code, and I
> doubt static analyzers will like it either.
>
> I agree that I will have to add sufficient comments for the future
> maintainer, should there ever be one (and even for me to still
> understand it next year). I don't use static analyzers.
>
> I'm not the biggest fan of inheritance you'll ever meet, but maybe this
> is a good place for it?
>
> Using a derived version of the class in question to overwrite the
> method was my first idea, however I don't instantiate the class in
> question myself, it is instantiated during the initialisation of
> another class, so I would at least have to derive a modified version of
> that as well. And that code is rather complex, with metaclasses and
> custom decorators, and I feel uncomfortable messing with that, while
> the method I intend to change is quite simple and straightforward.
>
> In case anybody is interested what I'm trying to achieve:
>
> It's simple in pandas to read an excel file into a dataframe, but only
> the cell value is read. Sometimes I need more / other information, e.g.
> some formatting or the hyperlink in a cell. Reopening the file with
> openpyxl and getting the info is possible, but cumbersome.
> Looking into the pandas code for reading excel files (which uses
> openpyxl internally) I noticed a method (of an internal pandas class)
> that extracts the value from an openpyxl cell. This method is rather
> simple and seems the ideal spot to change to get what I want.
>
> My idea is to instantiate pandas.ExcelFile (official pandas API), get
> the reader instance (an attribute of the ExcelFile object) and modify
> the method of the reader instance.
>
> The fact that the method I change and the ExcelFile attribute containing
> the reader are both private (start with _) doesn't make it any better,
> but I'm desperate enough to be willing to adapt my code to every major
> pandas release, if necessary.
>
> Ralf M.

--
https://mail.python.org/mailman/listinfo/python-list