[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-23 Thread Gregory P. Smith
Gregory P. Smith added the comment: Thanks Serhiy! -- resolution: -> fixed; stage: patch review -> commit review; status: open -> closed

[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-23 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: I'm not going to have time to look at the PR for a couple days. I don't understand what the use case is for writing or appending with filenames in a non-UTF-8 encoding. At least in my experience, reading such files is rare, but I have never been asked

[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-22 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset a25a985535ccbb7df8caddc0017550ff4eae5855 by Serhiy Storchaka in branch 'main': bpo-28080: Add support for the fallback encoding in ZIP files (GH-32007) https://github.com/python/cpython/commit/a25a985535ccbb7df8caddc0017550ff4eae5855
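For readers who want to try the fallback encoding from this changeset, here is a minimal sketch. It assumes the keyword-only `metadata_encoding` argument of `zipfile.ZipFile` (available from Python 3.11, read mode only) and guards the call with a version check. The cp866 encoding, the file name, and the byte-patching trick used to fake a legacy archive are illustrative assumptions, not part of the patch.

```python
import io
import sys
import zipfile

LEGACY = "cp866"                        # assumed legacy encoding for this demo
real_name = "привет.txt"                # hypothetical member name
legacy_bytes = real_name.encode(LEGACY)
placeholder = b"X" * len(legacy_bytes)  # ASCII stand-in of the same byte length

# zipfile writes non-ASCII names as UTF-8, so write an ASCII placeholder
# and patch the raw bytes to imitate an archive made by a legacy tool.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(placeholder.decode("ascii"), b"data")
archive = buf.getvalue().replace(placeholder, legacy_bytes)

# Default behaviour: names without the UTF-8 flag are decoded as cp437.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    default_names = zf.namelist()

# With the fallback encoding (Python 3.11+ only).
decoded_names = None
if sys.version_info >= (3, 11):
    with zipfile.ZipFile(io.BytesIO(archive), metadata_encoding=LEGACY) as zf:
        decoded_names = zf.namelist()
```

On older interpreters `decoded_names` stays `None` and only the cp437 mojibake in `default_names` is available.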

[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-22 Thread Gregory P. Smith
Gregory P. Smith added the comment: Your PR looks good to me. I agree with not making it easy to _write_ zipfiles with a non-standard encoding for names. There is a possibility (not yet clear) that someone wants that ability when writing zip files in https://bugs.python.org/issue40172.

[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I experimented with this a lot. There is a problem with the append mode. We can read in the append mode, therefore we need an encoding. But when we close a ZipFile after appending, non-ASCII file names will be encoded in UTF-8 in the central directory.
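The write-side behaviour described here can be observed with a small sketch, assuming only the standard `zipfile` API. The member names are made up for illustration. It writes one non-ASCII name, appends an ASCII one, and then checks general-purpose flag bit 11 (0x800), which marks a name as UTF-8.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: name is stored as UTF-8

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("привет.txt", b"data")   # non-ASCII name

# Appending rewrites the central directory when the archive is closed.
with zipfile.ZipFile(buf, "a") as zf:
    zf.writestr("ascii.txt", b"more")    # plain ASCII name

with zipfile.ZipFile(buf) as zf:
    flags = {info.filename: bool(info.flag_bits & UTF8_FLAG)
             for info in zf.infolist()}
```

The non-ASCII name comes back with the UTF-8 flag set and the ASCII name without it, which is why a reading-side encoding cannot simply be reused for output.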

[issue28080] Allow reading member names with bogus encodings in zipfile

2022-03-20 Thread Serhiy Storchaka
Change by Serhiy Storchaka: -- pull_requests: +30095; pull_request: https://github.com/python/cpython/pull/32007

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-12-27 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: Thanks for followup! I was just about to write you, now that 3.6 is out. Season's Greetings! First, how do you propose to proceed with issue28115 ("use argparse for the ZipFile module")? If you expect to commit that first (I'm in no hurry for this

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-12-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: A ZipFile can be read when opened in a mode other than read-only. Thus the encoding argument should be accepted when mode != 'r'. It would be weird to read file names and write new entries with different encodings. Thus the encoding argument should affect output

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-12-26 Thread INADA Naoki
Changes by INADA Naoki: -- nosy: +inada.naoki

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-12 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: Cleaned up a few loose ends while it's all fresh in mind. Will ping python-dev in 4-6 weeks for review for 3.7. Thanks to Serhiy for review. The current version of the patch is much improved over the initial submission due to his efforts. --

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Python is a programming language, so I don't understand what you mean by "available to nonprogrammers". As a programmer you can recode the ZipInfo name before outputting it, or whatever you want to do with it: filename = filename.encode('cp437').decode(encoding) In
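The recoding workaround quoted above can be sketched as a small helper. The helper name `recode_member_name`, the cp866 encoding, and the sample file name are hypothetical. The flag check is an added assumption: names stored with the UTF-8 flag (bit 0x800) are already decoded correctly and should be left alone.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: name is stored as UTF-8

def recode_member_name(info, encoding):
    """Return info.filename re-decoded with the given legacy encoding."""
    if info.flag_bits & UTF8_FLAG:
        # Already decoded as UTF-8 by zipfile; nothing to fix.
        return info.filename
    # zipfile decoded the raw bytes as cp437; undo that and decode again.
    return info.filename.encode("cp437").decode(encoding)

# Simulate what zipfile yields for a cp866 name stored without the UTF-8 flag.
raw = "привет.txt".encode("cp866")
info = zipfile.ZipInfo(raw.decode("cp437"))
fixed = recode_member_name(info, "cp866")
```

The round trip is lossless because Python's cp437 codec maps every byte value to a character, so encoding the mojibake back to cp437 recovers the original bytes.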

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-12 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: If you have a workaround that's available to nonprogrammers, I'd like to hear about it. I have found none; that's why I went to the trouble of putting together a patch even though I knew that the odds of actually getting it into Python 3.6 were very low --

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Sorry, I can't push the patch in haste. It needs more design discussion, comparison with other implementations (I found only that ZipFile in Java can take the charset argument, but "charset" is the common name for a text encoding in Java), and wider discussion. The

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: Can't reply on Rietveld? Lost 2 hours work! Patch updated (encoded-member-names-v2), most changes accepted. Not happy about name change or default to cp437, I want this API to be hard to use and not be part of the normal process (utf-8 or cp437).

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Added comments on Rietveld. Maybe I'll commit a rewritten patch tomorrow before 12:00 UTC (oh, already today!).

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: Re: wait for 3.7 if reviewers are busy, understood. N.B. Contributor agreement is now on file (I received the PDF from python.org already). Re: existing patches: My patch is very similar in the basic approach to Sergey Dorofeev's patch in issue10614.

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Ned Deily
Ned Deily added the comment: Stephen, thanks for submitting the patch. Unless another core developer has time to review and commit this prior to the feature code cutoff tomorrow, this will probably need to wait for 3.7. Also, at the moment, your tracker user record does not indicate that there

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This issue is a duplicate of issue10614. But the proposed patch looks more ready. -- assignee: -> serhiy.storchaka; stage: -> patch review

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: I should have a contributor agreement form on file. Ned Deily suggested that I try to get this patch in before the 12 noon deadline Sept. 12, so here it is. I believe the patch is "safe" in the sense that its functionality needs to be explicitly

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Stephen J. Turnbull
Stephen J. Turnbull added the comment: Suggested NEWS/whatsnew entry: Add a new *memberNameEncoding* argument to the ZipFile constructor, allowing :mod:`zipfile` to read filenames in non-conforming encodings from the zipfile as Unicode. This implementation assumes all member names have the

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Could you please provide more information, sjt? -- nosy: +serhiy.storchaka; status: open -> pending

[issue28080] Allow reading member names with bogus encodings in zipfile

2016-09-11 Thread Stephen J. Turnbull
Changes by Stephen J. Turnbull: -- components: Library (Lib); keywords: patch; nosy: sjt; priority: normal; severity: normal; status: open; title: Allow reading member names with bogus encodings in zipfile; type: enhancement; versions: Python 3.6