Nice. Thanks for jotting down the issues on arm64, python3.10, and such! 
Sounds to me like you can safely ignore/grep out that warning until it's 
fixed upstream. Filing an issue with ofxparse would be good.
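
If you'd rather filter inside Python than in the shell, here is a minimal sketch using the standard `warnings` module. It matches on the message text quoted in this thread, so it does not need to import bs4. The helper name `silence_xml_as_html_warning` is made up for illustration and is not part of ofxparse or beancount-reds-importers, and I'm assuming the filter gets installed in the same process, before the parsing code runs.

```python
import warnings

# Fragment of the warning text quoted in this thread. The filter regex is
# matched against the start of the message, hence the leading ".*".
XML_AS_HTML_PATTERN = r".*parsing an XML document using an HTML parser"


def silence_xml_as_html_warning():
    """Install an 'ignore' filter for the BeautifulSoup XML-as-HTML warning.

    Hypothetical helper: only warnings whose message matches the pattern are
    dropped; every other warning is still reported as usual.
    """
    warnings.filterwarnings("ignore", message=XML_AS_HTML_PATTERN)
```

One possible place to call it is near the top of the Python file that sets up your importers, assuming that file is loaded by the same process that does the parsing.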

Related: 'ofx-summarize' is a command included in beancount-reds-importers 
(bleeding edge) that you can use to inspect any ofx, get a quick and dirty 
summary, and explore via a pdb shell. 

Thanks for the note about asyncio as well. I made the change and cleaned up 
the script. It is also now installed as a command ("bean-download") as a 
part of beancount-reds-importers (bleeding edge only, for now).
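
For anyone updating their own copy of the script: one pattern that avoids the Python 3.10 deprecation warning is to let `asyncio.run()` create and close the event loop instead of calling `get_event_loop()` yourself. The sketch below is not the actual bean-download code; `download`, `main`, and `run_all` are hypothetical names, and `asyncio.sleep` stands in for the real download commands.

```python
import asyncio


async def download(name, delay):
    # Stand-in for a real download; a real script would await its
    # download command here instead of sleeping.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def main(jobs):
    # Run all downloads concurrently; gather() returns the results in the
    # same order as the jobs were passed in.
    return await asyncio.gather(*(download(name, delay) for name, delay in jobs))


def run_all(jobs):
    # asyncio.run() creates a fresh event loop, runs main() to completion,
    # and closes the loop, so no call to get_event_loop() is needed.
    return asyncio.run(main(jobs))
```

Calling `run_all([("chase", 0), ("fidelity", 0)])` runs both stand-in downloads and returns their result strings as a list.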


On Monday, June 6, 2022 at 10:27:17 AM UTC-7 [email protected] wrote:

> Your question about my platform got me thinking.
>
> I set up a new venv using python3.8 (instead of 3.10) and ran without any 
> warnings. Haven't looked into why that might be yet.
>
> Some tangential things I've run into:
>
>    - your very helpful template/reference script here 
>    <https://gist.github.com/redstreet/68f8ef59e4532f4de2271402238f370a> runs 
>    into a python 3.10 specific deprecation warning mentioned here 
>    <https://docs.python.org/3.10/library/asyncio-eventloop.html#asyncio.get_event_loop>.
>    They want you to use get_running_loop() instead of get_event_loop(). More 
>    discussion here <https://bugs.python.org/issue38599>. I'm not asking 
>    for a fix or help here, just sharing
>    - the original reason why I moved to python3.10 is because my platform 
>    is arm64e/macOS. In short, if you are using smart-importer on arm64e with 
>    python 3.8 (or earlier) you'll end up with scikit-learn built for x86 and 
>    you'll be unable to import it. There's a lot of talk about a way to get an 
>    arm build of scikit-learn using conda but it's a pain, would not recommend. 
>    Another option is to install everything for x86 and use Rosetta (e.g. `arch 
>    -x86_64 ./import.sh`). The last option is using python3.10, which appears 
>    to pull in everything you need to run natively with smart-importer
>
> So I think I have two options: use Rosetta and x86 for everything with 
> python 3.10, or explore running natively with python 3.10 and getting fixes 
> for the python3.10 specific issues.
>
> On Sunday, June 5, 2022 at 10:33:18 PM UTC-7 Red S wrote:
>
>> Hmm, I haven't come across this issue so far.
>>
>> It's the ofxparse library <https://github.com/jseutter/ofxparse> that 
>> uses BS4. I'd ask there. Indeed, they did decide 
>> <https://github.com/jseutter/ofxparse/pull/108> to parse this as HTML 
>> even though it's XML, but that code has worked fine for years now. What 
>> platform are you using?
>>
>> I'd also consider filtering out via the shell, if everything else works 
>> fine:
>> bean-extract [blah blah...] 2> >(grep -v XMLParsedAsHTMLWarning >&2)
>>
>>
>> On Sunday, June 5, 2022 at 6:10:35 PM UTC-7 [email protected] wrote:
>>
>>> Hey all,
>>>
>>> I'm getting the following warning:
>>> venv/lib/python3.10/site-packages/bs4/builder/__init__.py:545: 
>>> XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using 
>>> an HTML parser. If this really is an HTML document (maybe it's XHTML?), you 
>>> can ignore or filter this warning. If it's XML, you should know that using 
>>> an XML parser will be more reliable. To parse this document as XML, make 
>>> sure you have the lxml package installed, and pass the keyword argument 
>>> `features="xml"` into the BeautifulSoup constructor.
>>>   warnings.warn(
>>>
>>> What I'm doing to get this:
>>>
>>>    - Downloading account data using ofxget as described here 
>>>    <https://reds-rants.netlify.app/personal-finance/direct-downloads/>
>>>    - Importing that data using beancount-reds-importers (e.g. here 
>>>    <https://github.com/redstreet/beancount_reds_importers/blob/main/beancount_reds_importers/chase/__init__.py>)
>>>
>>> Things I've tried or discovered:
>>>
>>>    - I looked for all instances of `soup = BeautifulSoup .. ` and found 
>>>    the main calls in ofx.py. I tried changing these calls from feature=lxml 
>>>    to feature=xml, which didn't resolve the warning
>>>    - I made sure lxml is downloaded
>>>    - I tried to suppress the warning with warnings.filterwarnings but 
>>>    that didn't work either (not sure it would be the "right" thing either)
>>>    - I found a PR in an unrelated repo where they solved by suppressing 
>>>    here <https://github.com/EnergieID/entsoe-py/issues/180>
>>>    - I tried ofx data downloaded from both Fidelity Investments and 
>>>    Chase (not expecting this to be institution specific)
>>>
>>> Questions I have:
>>>
>>>    - The warning doesn't really help me understand what call into 
>>>    BeautifulSoup caused the warning. Any tips on how to track down where 
>>>    the issue is coming from? Maybe ofx.py isn't part of the issue at all
>>>    - I think bean-extract is still working, but any suggestions on whether 
>>>    the warning should be ignored or resolved would also be appreciated
>>>
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beancount/c00152d9-3213-44db-aa75-c79ba7144d81n%40googlegroups.com.
