Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
2014-05-30 7:30 GMT-03:00 Reinier Olislagers:
> On 27/05/2014 00:41, silvioprog wrote:
> > Perfect! :)
>
> Please feel free to open that bug report if you haven't already.
>
> Thanks.

Done: http://bugs.freepascal.org/view.php?id=26213

--
Silvio Clécio
My public projects - github.com/silvioprog
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On 27/05/2014 00:41, silvioprog wrote:
> Perfect! :)

Please feel free to open that bug report if you haven't already.

Thanks.
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
2014-05-24 5:02 GMT-03:00 Reinier Olislagers:
> On 24/05/2014 08:33, Michael Van Canneyt wrote:
> > On Fri, 23 May 2014, Craig Peterson wrote:
> >> The Info-zip project maintains an annotated Appnote that lists a bunch
> >> of the extra fields that various vendors use here:
> >> http://www.info-zip.org/doc/
> > Strange then that the info-zip so often creates garbled filenames on
> > Linux !
> > Probably the used zipper doesn't use the extra fields.
> Of course it doesn't use them. It has no unicode support.
>
> > Since it is optional, we can probably add a WideString property to the
> > zipitem and add an overloaded call;
> > Then we don't need to recreate everything,
> Recreate everything? Don't get what you mean here.
>
> > just add the extra fields.
> (Yes, IIRC, the zip64 fix added support for extra fields already so it
> shouldn't be a big change)
>
> Agreed when writing.
> When reading, I'd suggest following the suggested behaviour in the spec:
> 1. Try to read UTF8 filename from the EFS; if not present fall back to
> 2. Try to read extra fields filename as implemented by info-zip/abbrevia
>    etc, if not present fall back to
> 3. current behaviour (reading filename as is)

Perfect! :)

--
Silvio Clécio
My public projects - github.com/silvioprog
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On 24/05/2014 08:33, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, Craig Peterson wrote:
>> The Info-zip project maintains an annotated Appnote that lists a bunch
>> of the extra fields that various vendors use here:
>> http://www.info-zip.org/doc/
> Strange then that the info-zip so often creates garbled filenames on
> Linux !
> Probably the used zipper doesn't use the extra fields.

Of course it doesn't use them. It has no unicode support.

> Since it is optional, we can probably add a WideString property to the
> zipitem and add an overloaded call;
> Then we don't need to recreate everything,

Recreate everything? Don't get what you mean here.

> just add the extra fields.

(Yes, IIRC, the zip64 fix added support for extra fields already so it
shouldn't be a big change)

Agreed when writing.
When reading, I'd suggest following the suggested behaviour in the spec:
1. Try to read UTF8 filename from the EFS; if not present fall back to
2. Try to read extra fields filename as implemented by info-zip/abbrevia
   etc, if not present fall back to
3. current behaviour (reading filename as is)
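[Editorial sketch] The three-step reading order proposed above could look roughly like this. It is only an illustration under stated assumptions: TEntryInfo and PickFileName are hypothetical names, not part of paszlib or the zipper unit, and the sketch assumes the UTF-8 name from the Info-ZIP style extra field has already been extracted (empty when absent). "EFS" is taken to mean general purpose bit 11, the language encoding flag.

```pascal
program NameFallback;
{$mode objfpc}{$H+}

const
  EFS_FLAG = $0800; // general purpose bit 11 (language encoding flag)

type
  // Hypothetical view of one archive entry; not a real paszlib type.
  TEntryInfo = record
    Flags: Word;               // general purpose bit flags of the entry
    RawName: AnsiString;       // name bytes exactly as stored in the header
    ExtraUtf8Name: AnsiString; // UTF-8 name from an extra field, '' if absent
  end;

// Returns the name bytes to use and reports whether they are known to be UTF-8.
function PickFileName(const E: TEntryInfo; out IsUtf8: Boolean): AnsiString;
begin
  IsUtf8 := True;
  // 1. EFS flag set: the stored name itself is UTF-8.
  if (E.Flags and EFS_FLAG) <> 0 then
    Exit(E.RawName);
  // 2. Otherwise prefer a UTF-8 name carried in an extra field.
  //    (A real reader should also verify the CRC stored in that field
  //    against the raw name; that check is omitted here.)
  if E.ExtraUtf8Name <> '' then
    Exit(E.ExtraUtf8Name);
  // 3. Current behaviour: hand back the raw bytes, encoding unknown.
  IsUtf8 := False;
  Result := E.RawName;
end;

var
  E: TEntryInfo;
  Utf8: Boolean;
  Name: AnsiString;
begin
  E.Flags := 0;                                      // EFS flag not set
  E.RawName := 'aten'#$E7#$E3'o.txt';                // single-byte codepage bytes
  E.ExtraUtf8Name := 'aten'#$C3#$A7#$C3#$A3'o.txt';  // UTF-8 bytes for the same name
  Name := PickFileName(E, Utf8);
  WriteLn('UTF-8 name available: ', Utf8, ' (', Length(Name), ' bytes)');
end.
```

Returning the raw bytes together with an "is UTF-8" indicator leaves any conversion to the caller, which keeps step 3 identical to the existing behaviour.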
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On Fri, 23 May 2014, Craig Peterson wrote:
> The Info-zip project maintains an annotated Appnote that lists a bunch
> of the extra fields that various vendors use here:
> http://www.info-zip.org/doc/

Great, thank you for this information.
Nice to see an expert on the list :)

Strange then that the info-zip so often creates garbled filenames on Linux !
Probably the used zipper doesn't use the extra fields.
If it isn't there, infozip cannot use it obviously.

Since it is optional, we can probably add a WideString property to the
zipitem and add an overloaded call;
Then we don't need to recreate everything, just add the extra fields.

Michael.
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
The Info-zip project maintains an annotated Appnote that lists a bunch
of the extra fields that various vendors use here:
http://www.info-zip.org/doc/

--
Craig Peterson
Scooter Software
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On May 23, 2014, at 8:26 PM, silvioprog wrote:
> 2014-05-23 15:50 GMT-03:00 Craig Peterson
> I have a question. Adding this extended header, can I open/uncompress the zip
> file normally in programs like 7z and WinRAR?

Yes. The appnote describes the format for the "extra field", which is
extensible so multiple records can be stored and applications can skip
over any they don't understand. 7-zip doesn't use it, but doesn't have a
problem with it. I'm not sure about WinRAR, but WinZip will use the
header if it exists. I didn't invent it for Abbrevia; it was originally
designed by the Info-zip guys.

--
Craig Peterson
Scooter Software
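[Editorial sketch] The "skip what you don't understand" property follows from the extra field layout: each record is a 2-byte ID, a 2-byte data size (both little-endian), then the data. A minimal sketch of walking such a block is below. FindExtraRecord and the demo bytes are made up for illustration and are not taken from paszlib or Abbrevia; $7075 is the ID Info-ZIP assigns to its Unicode Path record.

```pascal
program WalkExtra;
{$mode objfpc}{$H+}

const
  UNICODE_PATH_ID = $7075; // Info-ZIP Unicode Path extra field ID
  // Demo block: an unrelated record (ID $0001, 2 data bytes) followed by
  // a record with ID $7075 and 3 bytes of dummy data.
  Demo: array[0..12] of Byte =
    ($01, $00, $02, $00, $AA, $BB,
     $75, $70, $03, $00, $01, $02, $03);

// Scans an extra-field block for the record with WantedId.
// Records with other IDs are skipped using their declared size.
function FindExtraRecord(const Extra: array of Byte; WantedId: Word;
  out DataOfs, DataLen: Integer): Boolean;
var
  P: Integer;
  Id, Size: Word;
begin
  Result := False;
  DataOfs := 0;
  DataLen := 0;
  P := 0;
  while P + 4 <= Length(Extra) do
  begin
    Id := Word(Extra[P]) or (Word(Extra[P + 1]) shl 8);
    Size := Word(Extra[P + 2]) or (Word(Extra[P + 3]) shl 8);
    if P + 4 + Size > Length(Extra) then
      Exit; // truncated record: stop instead of reading past the block
    if Id = WantedId then
    begin
      DataOfs := P + 4;
      DataLen := Size;
      Exit(True);
    end;
    Inc(P, 4 + Size); // unknown record: skip header and data
  end;
end;

var
  Ofs, Len: Integer;
begin
  if FindExtraRecord(Demo, UNICODE_PATH_ID, Ofs, Len) then
    WriteLn('Unicode Path record: data offset ', Ofs, ', length ', Len)
  else
    WriteLn('no Unicode Path record');
end.
```

A reader that does not know $7075 would simply never ask for it, which matches the behaviour described for 7-zip above.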
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
2014-05-23 15:50 GMT-03:00 Craig Peterson:
> > Nice. I can do it, opening a new issue in bugtracker.
>
> Filename encoding in zip files is poorly defined. The current
> APPNOTE.txt says that the only valid encoding is OEM 437, with UTF-8 if
> a bit is set in the header, but those were recent additions, and in
> practice Windows applications will generally use either the OEM or ANSI
> codepage of the current system locale, and files generated on Unix will
> be UTF-8 but won't have the language encoding bit set.
>
> Abbrevia's zip encoding/decoding tries to handle the issue in as
> compatible a manner as possible. It stores the original filenames as
> OEM/ANSI based on the current system, and stores a UTF-8 copy in an
> extended header so there's a known way to decode it when changing
> locales. When reading it has to use lookup tables to guess if the
> filenames are likely OEM or ANSI. On Unicode-enabled Delphi releases
> it's fully Unicode enabled; on FreePascal and older Delphi releases it
> only supports ANSI filenames but still does proper encoding/decoding.
>
> The relevant code is in AbZipTyp.pas in TAbZipItem.SetFilename and
> TAbZipItem.LoadFromStream if you want a reference. It's under the MPL,
> but I'm the original author and I'm happy to relicense it if someone
> else wants to incorporate the code into paszlib.
>
> https://sourceforge.net/p/tpabbrevia/code/HEAD/tree/trunk/source/AbZipTyp.pas
>
> --
> Craig Peterson
> Scooter Software

Very nice.

I have a question. Adding this extended header, can I open/uncompress the
zip file normally in programs like 7z and WinRAR?

--
Silvio Clécio
My public projects - github.com/silvioprog
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
> Nice. I can do it, opening a new issue in bugtracker.

Filename encoding in zip files is poorly defined. The current
APPNOTE.txt says that the only valid encoding is OEM 437, with UTF-8 if
a bit is set in the header, but those were recent additions, and in
practice Windows applications will generally use either the OEM or ANSI
codepage of the current system locale, and files generated on Unix will
be UTF-8 but won't have the language encoding bit set.

Abbrevia's zip encoding/decoding tries to handle the issue in as
compatible a manner as possible. It stores the original filenames as
OEM/ANSI based on the current system, and stores a UTF-8 copy in an
extended header so there's a known way to decode it when changing
locales. When reading it has to use lookup tables to guess if the
filenames are likely OEM or ANSI. On Unicode-enabled Delphi releases
it's fully Unicode enabled; on FreePascal and older Delphi releases it
only supports ANSI filenames but still does proper encoding/decoding.

The relevant code is in AbZipTyp.pas in TAbZipItem.SetFilename and
TAbZipItem.LoadFromStream if you want a reference. It's under the MPL,
but I'm the original author and I'm happy to relicense it if someone
else wants to incorporate the code into paszlib.

https://sourceforge.net/p/tpabbrevia/code/HEAD/tree/trunk/source/AbZipTyp.pas

--
Craig Peterson
Scooter Software
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
2014-05-23 13:28 GMT-03:00 Michael Van Canneyt:
> On Fri, 23 May 2014, Tomas Hajny wrote:
>> On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
>>> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>>>> On Fri, 23 May 2014, silvioprog wrote:
>>>>> I have a program that makes daily backups, and just discovered this
>>>>> problem when I noticed that it was not compressing files whose names
>>>>> contain special characters.
>>>> This is not fixable. The zip standard has no rules for encoding
>>>> filenames. Whatever bytes you put in is what comes out.
>>>> You are entirely responsible for handling the encoding.
>>> Not quite, according to the spec
>>> http://www.pkware.com/documents/casestudies
>> I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
>> http://www.pkware.com/documents/casestudies/APPNOTE.TXT.
> Indeed.
>
> A proper implementation would require a serious rewrite of this component
> with a unicode API. This can be entered as a feature request if so desired.

Nice. I can do it, opening a new issue in bugtracker.

As a temporary workaround, I fixed my problem by 'underlining' the file
name, e.g.:

    program project1;

    {$mode objfpc}{$H+}

    uses
      zipper, zstream, sysutils,
      RUtils { https://github.com/silvioprog/rutils/blob/master/src/rutils.pas };

    var
      dir, fil, fn, zip: string;
    begin
      dir := 'C:\Silvio Clécio\Cópias de segurança\';
      fn := 'instrução-para-desbloqueio-do-PIN.pdf';
      fil := dir + fn;
      zip := dir + RUtils.UnderlineStr(fn + '.zip');
      with TZipper.Create do
      try
        FileName := {$IFDEF MSWINDOWS}Utf8ToAnsi({$ENDIF}zip{$IFDEF MSWINDOWS}){$ENDIF};
        Entries.AddFileEntry(
          {$IFDEF MSWINDOWS}Utf8ToAnsi({$ENDIF}fil{$IFDEF MSWINDOWS}){$ENDIF},
          RUtils.UnderlineStr(fn)).CompressionLevel := clMax;
        ZipAllFiles;
      finally
        Free;
      end;
    end.

Thanks!

--
Silvio Clécio
My public projects - github.com/silvioprog
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On Fri, 23 May 2014, Tomas Hajny wrote:
> On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
>> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>>> On Fri, 23 May 2014, silvioprog wrote:
> .
> .
>>>> I have a program that makes daily backups, and just discovered this
>>>> problem when I noticed that it was not compressing files whose names
>>>> contain special characters.
>>> This is not fixable. The zip standard has no rules for encoding
>>> filenames. Whatever bytes you put in is what comes out.
>>> You are entirely responsible for handling the encoding.
>> Not quite, according to the spec
>> http://www.pkware.com/documents/casestudies
> .
> .
> I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
> http://www.pkware.com/documents/casestudies/APPNOTE.TXT.

Indeed.

A proper implementation would require a serious rewrite of this component
with a unicode API. This can be entered as a feature request if so desired.

Michael.
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>> On Fri, 23 May 2014, silvioprog wrote:
 .
 .
>>> I have a program that makes daily backups, and just discovered this
>>> problem when I noticed that it was not compressing files whose names
>>> contain special characters.
>>
>> This is not fixable. The zip standard has no rules for encoding
>> filenames.
>> Whatever bytes you put in is what comes out.
>> You are entirely responsible for handling the encoding.
>
> Not quite, according to the spec
> http://www.pkware.com/documents/casestudies
 .
 .

I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
http://www.pkware.com/documents/casestudies/APPNOTE.TXT.

Tomas
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On 23/05/2014 17:30, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, silvioprog wrote:
>> I tried to compress a small file with the TZipper class. The file
>> compresses correctly, but the file name stored inside the archive is
>> wrong: after compression, the original "atenção.txt" is stored as
>> "atenþÒo.txt".
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and just discovered this
>> problem when I noticed that it was not compressing files whose names
>> contain special characters.
>
> This is not fixable. The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.

Not quite, according to the spec
http://www.pkware.com/documents/casestudies
Appendix D language encoding/APPNOTE.TXT, D.2 supports UTF8.
See also D.5 and an alternative approach D.6/D.7 (which allows for more
backward compatibility)
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On Fri, May 23, 2014 17:30, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, silvioprog wrote:
>
>> Hello,
>> I tried to compress a small file with the TZipper class. The file
>> compresses correctly, but the file name stored inside the archive is
>> wrong: after compression, the original "atenção.txt" is stored as
>> "atenþÒo.txt".
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and just discovered this
>> problem when I noticed that it was not compressing files whose names
>> contain special characters.
>
> This is not fixable.
> The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.

While I more or less agree with your statement, I could find a note about
some UTF-8 support extension (see
http://stackoverflow.com/questions/15519493/how-to-add-zip-entry-with-utf-8-name-to-zip)
which might help to a certain extent (if supported by FPC). If nothing
else, this would probably be relevant at least for a future (trunk)
Unicode-compatible version of that class.

Tomas
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
2014-05-23 12:30 GMT-03:00 Michael Van Canneyt:
> On Fri, 23 May 2014, silvioprog wrote:
>
>> Hello,
>> I tried to compress a small file with the TZipper class. The file
>> compresses correctly, but the file name stored inside the archive is
>> wrong: after compression, the original "atenção.txt" is stored as
>> "atenþÒo.txt".
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and just discovered this
>> problem when I noticed that it was not compressing files whose names
>> contain special characters.
>
> This is not fixable. The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.

Hum... isn't it possible to work with files like "Silvio Clécio.txt"? :(

I tested SynZIP (http://synopse.info/fossil/wiki?name=Downloads) in Delphi,
and it worked fine:

    with TZipWrite.Create('Silvio Clécio.txt.zip') do
    try
      AddDeflated('Silvio Clécio.txt', True, 9);
    finally
      Free;
    end;

But it isn't viable for me to port my backup program to Delphi right now.

--
Silvio Clécio
My public projects - github.com/silvioprog
Re: [fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
On Fri, 23 May 2014, silvioprog wrote:
> Hello,
> I tried to compress a small file with the TZipper class. The file
> compresses correctly, but the file name stored inside the archive is
> wrong: after compression, the original "atenção.txt" is stored as
> "atenþÒo.txt".
>
> I opened an issue here:
>
> http://bugs.freepascal.org/view.php?id=26213
>
> I have a program that makes daily backups, and just discovered this
> problem when I noticed that it was not compressing files whose names
> contain special characters.

This is not fixable. The zip standard has no rules for encoding filenames.
Whatever bytes you put in is what comes out.
You are entirely responsible for handling the encoding.

Michael.
[fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)
Hello,

I tried to compress a small file with the TZipper class. The file
compresses correctly, but the file name stored inside the archive is
wrong: after compression, the original "atenção.txt" is stored as
"atenþÒo.txt".

I opened an issue here:

http://bugs.freepascal.org/view.php?id=26213

I have a program that makes daily backups, and just discovered this
problem when I noticed that it was not compressing files whose names
contain special characters.

Thank you!

--
Silvio Clécio
My public projects - github.com/silvioprog