Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-06-11 Thread silvioprog
2014-05-30 7:30 GMT-03:00 Reinier Olislagers reinierolislag...@gmail.com:

> On 27/05/2014 00:41, silvioprog wrote:
>> Perfect! :)
>
> Please feel free to open that bug report if you haven't already.
>
> Thanks.


Done: http://bugs.freepascal.org/view.php?id=26213

-- 
Silvio Clécio
My public projects - github.com/silvioprog
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-30 Thread Reinier Olislagers
On 27/05/2014 00:41, silvioprog wrote:
> Perfect! :)

Please feel free to open that bug report if you haven't already.

Thanks.



Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-26 Thread silvioprog
2014-05-24 5:02 GMT-03:00 Reinier Olislagers reinierolislag...@gmail.com:

> On 24/05/2014 08:33, Michael Van Canneyt wrote:
>> On Fri, 23 May 2014, Craig Peterson wrote:
>>> The Info-zip project maintains an annotated Appnote that lists a bunch
>>> of the extra fields that various vendors use here:
>>> http://www.info-zip.org/doc/
>> Strange, then, that Info-ZIP so often creates garbled filenames on
>> Linux!
>> Probably the zipper that was used doesn't use the extra fields.
> Of course it doesn't use them. It has no Unicode support.
>
>> Since it is optional, we can probably add a WideString property to the
>> zipitem and add an overloaded call;
>> Then we don't need to recreate everything,
> Recreate everything? I don't get what you mean here.
>
>> just add the extra fields.
> (Yes, IIRC, the zip64 fix added support for extra fields already, so it
> shouldn't be a big change.)
>
> Agreed when writing.
> When reading, I'd suggest following the behaviour suggested in the spec:
> 1. Try to read a UTF-8 filename as signalled by the EFS flag; if not
>    present, fall back to
> 2. Try to read the extra-field filename as implemented by Info-ZIP/Abbrevia
>    etc.; if not present, fall back to
> 3. the current behaviour (reading the filename as is).


Perfect! :)

-- 
Silvio Clécio
My public projects - github.com/silvioprog

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-24 Thread Michael Van Canneyt



On Fri, 23 May 2014, Craig Peterson wrote:

> The Info-zip project maintains an annotated Appnote that lists a bunch of the
> extra fields that various vendors use here:
> http://www.info-zip.org/doc/


Great, thank you for this information. Nice to see an expert on the list :)

Strange, then, that Info-ZIP so often creates garbled filenames on Linux!
Probably the zipper that was used doesn't use the extra fields. If the field
isn't there, Info-ZIP obviously cannot use it.

Since it is optional, we can probably add a WideString property to the zipitem 
and add an overloaded call;
Then we don't need to recreate everything, just add the extra fields.

Michael.


Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-24 Thread Reinier Olislagers
On 24/05/2014 08:33, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, Craig Peterson wrote:
>> The Info-zip project maintains an annotated Appnote that lists a bunch
>> of the extra fields that various vendors use here:
>> http://www.info-zip.org/doc/
> Strange, then, that Info-ZIP so often creates garbled filenames on
> Linux!
> Probably the zipper that was used doesn't use the extra fields.
Of course it doesn't use them. It has no Unicode support.

> Since it is optional, we can probably add a WideString property to the
> zipitem and add an overloaded call;
> Then we don't need to recreate everything,
Recreate everything? I don't get what you mean here.

> just add the extra fields.
(Yes, IIRC, the zip64 fix added support for extra fields already, so it
shouldn't be a big change.)

Agreed when writing.
When reading, I'd suggest following the behaviour suggested in the spec:
1. Try to read a UTF-8 filename as signalled by the EFS flag; if not
   present, fall back to
2. Try to read the extra-field filename as implemented by Info-ZIP/Abbrevia
   etc.; if not present, fall back to
3. the current behaviour (reading the filename as is).
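
The fallback order above could be sketched roughly as follows. This is only
an illustration under stated assumptions, not TZipper code: the names
FindUnicodePath and ChooseName and the Sample bytes are made up for the
example. It assumes that step 1 means general purpose bit 11 (the EFS flag)
and that step 2 means the Info-ZIP Unicode Path extra field (header ID $7075:
one version byte, a CRC-32 of the legacy name, then the UTF-8 name). The
CRC-32 check that a real reader should perform before trusting that field is
left out to keep the sketch short.

```pascal
program ZipNameFallback;

{$mode objfpc}{$H+}

uses
  SysUtils;

const
  EFS_FLAG        = $0800; // general purpose bit 11: name field is UTF-8
  UNICODE_PATH_ID = $7075; // Info-ZIP Unicode Path extra field

  // Hypothetical extra-field block: an unknown record ($9999, 2 data bytes)
  // followed by a Unicode Path record holding the name 'a.b' (CRC left at 0).
  Sample: array[0..17] of Byte = (
    $99, $99, 2, 0, $AA, $BB,
    $75, $70, 8, 0, 1, 0, 0, 0, 0, Ord('a'), Ord('.'), Ord('b'));

{ Returns the UTF-8 name from a Unicode Path record, or '' if there is none.
  Records with other IDs are skipped by means of their size field. }
function FindUnicodePath(const Extra: TBytes): RawByteString;
var
  Ofs, Id, Size, NameLen: Integer;
begin
  Result := '';
  Ofs := 0;
  while Ofs + 4 <= Length(Extra) do
  begin
    Id   := Extra[Ofs] or (Extra[Ofs + 1] shl 8);     // little-endian ID
    Size := Extra[Ofs + 2] or (Extra[Ofs + 3] shl 8); // little-endian size
    Inc(Ofs, 4);
    if Ofs + Size > Length(Extra) then
      Exit; // truncated record: give up
    if (Id = UNICODE_PATH_ID) and (Size > 5) and (Extra[Ofs] = 1) then
    begin
      // Skip the version byte and the 4 CRC bytes; the rest is the name
      NameLen := Size - 5;
      SetLength(Result, NameLen);
      Move(Extra[Ofs + 5], Result[1], NameLen);
      Exit;
    end;
    Inc(Ofs, Size);
  end;
end;

{ Applies the three-step order; IsUtf8 tells the caller how to decode. }
function ChooseName(Flags: Word; const RawName: RawByteString;
  const Extra: TBytes; out IsUtf8: Boolean): RawByteString;
begin
  IsUtf8 := True;
  // 1. EFS flag set: the regular name field is already UTF-8
  if (Flags and EFS_FLAG) <> 0 then
    Exit(RawName);
  // 2. Otherwise look for a UTF-8 copy in the extra field
  Result := FindUnicodePath(Extra);
  if Result <> '' then
    Exit;
  // 3. Current behaviour: hand back the bytes untouched
  IsUtf8 := False;
  Result := RawName;
end;

var
  Extra: TBytes;
  Name: RawByteString;
  Utf8: Boolean;
begin
  SetLength(Extra, Length(Sample));
  Move(Sample[0], Extra[0], Length(Sample));
  Name := ChooseName(0, 'A.B', Extra, Utf8);   // step 2 applies
  WriteLn(Name, ' utf8=', Utf8);
  Name := ChooseName(0, 'A.B', nil, Utf8);     // step 3 applies
  WriteLn(Name, ' utf8=', Utf8);
end.
```

Returning the raw bytes together with a flag, instead of converting inside
the function, leaves the choice of string type (WideString, UTF8String, ...)
to whatever API the class ends up with.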



[fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread silvioprog
Hello,

I've tried to compress a small file with the TZipper class. Although it
compresses correctly, the file name stored inside the archive is wrong:
after compression, the original atenção.txt was renamed to atenþÒo.txt.

I opened an issue here:

http://bugs.freepascal.org/view.php?id=26213

I have a program that makes daily backups, and I only discovered this
problem when I noticed that it did not properly compress files with
special characters in their names.

Thank you!

-- 
Silvio Clécio
My public projects - github.com/silvioprog

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Michael Van Canneyt



On Fri, 23 May 2014, silvioprog wrote:

> Hello,
> I've tried to compress a small file with the TZipper class. Although it
> compresses correctly, the file name stored inside the archive is wrong:
> after compression, the original atenção.txt was renamed to atenþÒo.txt.
>
> I opened an issue here:
>
> http://bugs.freepascal.org/view.php?id=26213
>
> I have a program that makes daily backups, and I only discovered this
> problem when I noticed that it did not properly compress files with
> special characters in their names.


This is not fixable. 
The zip standard has no rules for encoding filenames.

Whatever bytes you put in is what comes out.
You are entirely responsible for handling the encoding.

Michael.

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread silvioprog
2014-05-23 12:30 GMT-03:00 Michael Van Canneyt mich...@freepascal.org:

> On Fri, 23 May 2014, silvioprog wrote:
>
>> Hello,
>> I've tried to compress a small file with the TZipper class. Although it
>> compresses correctly, the file name stored inside the archive is wrong:
>> after compression, the original atenção.txt was renamed to atenþÒo.txt.
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and I only discovered this
>> problem when I noticed that it did not properly compress files with
>> special characters in their names.
>
> This is not fixable. The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.


Hmm... so it isn't possible to work with files like Silvio Clécio.txt? :(

I tested SynZIP (http://synopse.info/fossil/wiki?name=Downloads) in Delphi,
and it worked fine:

  with TZipWrite.Create('Silvio Clécio.txt.zip') do
  try
    AddDeflated('Silvio Clécio.txt', True, 9);
  finally
    Free;
  end;

But it isn't feasible for me to port my backup program to Delphi right now.

-- 
Silvio Clécio
My public projects - github.com/silvioprog

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Tomas Hajny
On Fri, May 23, 2014 17:30, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, silvioprog wrote:
>
>> Hello,
>> I've tried to compress a small file with the TZipper class. Although it
>> compresses correctly, the file name stored inside the archive is wrong:
>> after compression, the original atenção.txt was renamed to atenþÒo.txt.
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and I only discovered this
>> problem when I noticed that it did not properly compress files with
>> special characters in their names.
>
> This is not fixable.
> The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.

While I more or less agree with your statement, I did find a note about
a UTF-8 support extension (see
http://stackoverflow.com/questions/15519493/how-to-add-zip-entry-with-utf-8-name-to-zip)
which might help to a certain extent (if supported by FPC). If nothing else,
this would probably be relevant at least for a future (trunk)
Unicode-compatible version of that class.

Tomas



Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Reinier Olislagers
On 23/05/2014 17:30, Michael Van Canneyt wrote:
> On Fri, 23 May 2014, silvioprog wrote:
>> I've tried to compress a small file with the TZipper class. Although it
>> compresses correctly, the file name stored inside the archive is wrong:
>> after compression, the original atenção.txt was renamed to atenþÒo.txt.
>>
>> I opened an issue here:
>>
>> http://bugs.freepascal.org/view.php?id=26213
>>
>> I have a program that makes daily backups, and I only discovered this
>> problem when I noticed that it did not properly compress files with
>> special characters in their names.
>
> This is not fixable. The zip standard has no rules for encoding filenames.
> Whatever bytes you put in is what comes out.
> You are entirely responsible for handling the encoding.

Not quite: according to the spec
http://www.pkware.com/documents/casestudies/APPNOTE.TXT
(Appendix D, language encoding), D.2 supports UTF-8.
See also D.5 and an alternative approach in D.6/D.7 (which allows for more
backward compatibility).



Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Tomas Hajny
On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>> On Fri, 23 May 2014, silvioprog wrote:
>>> .
>>> .
>>> I have a program that makes daily backups, and I only discovered this
>>> problem when I noticed that it did not properly compress files with
>>> special characters in their names.
>>
>> This is not fixable. The zip standard has no rules for encoding
>> filenames.
>> Whatever bytes you put in is what comes out.
>> You are entirely responsible for handling the encoding.
>
> Not quite, according to the spec
> http://www.pkware.com/documents/casestudies
> .
> .

I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
http://www.pkware.com/documents/casestudies/APPNOTE.TXT.

Tomas




Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Michael Van Canneyt



On Fri, 23 May 2014, Tomas Hajny wrote:

> On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
>> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>>> On Fri, 23 May 2014, silvioprog wrote:
>>>> .
>>>> .
>>>> I have a program that makes daily backups, and I only discovered this
>>>> problem when I noticed that it did not properly compress files with
>>>> special characters in their names.
>>>
>>> This is not fixable. The zip standard has no rules for encoding
>>> filenames.
>>> Whatever bytes you put in is what comes out.
>>> You are entirely responsible for handling the encoding.
>>
>> Not quite, according to the spec
>> http://www.pkware.com/documents/casestudies
>> .
>> .
>
> I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
> http://www.pkware.com/documents/casestudies/APPNOTE.TXT.


Indeed.

A proper implementation would require a serious rewrite of this component
with a unicode API. This can be entered as a feature request if so desired.

Michael.


Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread silvioprog
2014-05-23 13:28 GMT-03:00 Michael Van Canneyt mich...@freepascal.org:

> On Fri, 23 May 2014, Tomas Hajny wrote:
>> On Fri, May 23, 2014 17:38, Reinier Olislagers wrote:
>>> On 23/05/2014 17:30, Michael Van Canneyt wrote:
>>>> On Fri, 23 May 2014, silvioprog wrote:
>>>>> I have a program that makes daily backups, and I only discovered this
>>>>> problem when I noticed that it did not properly compress files with
>>>>> special characters in their names.
>>>>
>>>> This is not fixable. The zip standard has no rules for encoding
>>>> filenames.
>>>> Whatever bytes you put in is what comes out.
>>>> You are entirely responsible for handling the encoding.
>>>
>>> Not quite, according to the spec
>>> http://www.pkware.com/documents/casestudies
>>
>> I can't access this URL (HTTP 403 - Forbidden). The right URL seems to be
>> http://www.pkware.com/documents/casestudies/APPNOTE.TXT.
>
> Indeed.
>
> A proper implementation would require a serious rewrite of this component
> with a unicode API. This can be entered as a feature request if so desired.


Nice. I can do that by opening a new issue in the bug tracker.

As a temporary workaround, I fixed my problem by 'underlining' the file name, e.g.:

program project1;

{$mode objfpc}{$H+}

uses
  zipper, zstream, sysutils,
  RUtils { https://github.com/silvioprog/rutils/blob/master/src/rutils.pas };

var
  dir, fil, fn, zip: string;
begin
  dir := 'C:\Silvio Clécio\Cópias de segurança\';
  fn := 'instrução-para-desbloqueio-do-PIN.pdf';
  fil := dir + fn;
  zip := dir + RUtils.UnderlineStr(fn + '.zip');
  with TZipper.Create do
  try
    FileName :=
      {$IFDEF MSWINDOWS}Utf8ToAnsi({$ENDIF}zip{$IFDEF MSWINDOWS}){$ENDIF};
    Entries.AddFileEntry(
      {$IFDEF MSWINDOWS}Utf8ToAnsi({$ENDIF}fil{$IFDEF MSWINDOWS}){$ENDIF},
      RUtils.UnderlineStr(fn)).CompressionLevel := clMax;
    ZipAllFiles;
  finally
    Free;
  end;
end.

Thanks!

-- 
Silvio Clécio
My public projects - github.com/silvioprog

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Craig Peterson
> Nice. I can do that by opening a new issue in the bug tracker.

Filename encoding in zip files is poorly defined.  The current
APPNOTE.txt says that the only valid encoding is OEM 437, with UTF-8 if
a bit is set in the header, but those were recent additions, and in
practice Windows applications will generally use either the OEM or ANSI
codepage of the current system locale, and files generated on Unix will
be UTF-8 but won't have the language encoding bit set.

Abbrevia's zip encoding/decoding tries to handle the issue in as
compatible a manner as possible.  It stores the original filenames as
OEM/ANSI based on the current system, and stores a UTF-8 copy in an
extended header so there's a known way to decode it when changing
locales.  When reading it has to use lookup tables to guess if the
filenames are likely OEM or ANSI.  On Unicode-enabled Delphi releases
it's fully Unicode enabled; on FreePascal and older Delphi releases it
only supports ANSI filenames but still does proper encoding/decoding.
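
As a rough illustration of the "UTF-8 copy in an extended header" idea, here
is a minimal sketch that builds such a record. It is not the Abbrevia code:
it assumes the header in question is the Info-ZIP Unicode Path extra field
(header ID $7075, version 1, CRC-32 of the legacy name, then the UTF-8 name),
and the names Crc32Of and BuildUnicodePathField are made up for the example.
The two sample names are atenção.txt written out byte by byte, once in an
ANSI (Windows-1252) encoding and once in UTF-8.

```pascal
program BuildUnicodePath;

{$mode objfpc}{$H+}

uses
  SysUtils;

{ Plain bitwise CRC-32 (reflected polynomial $EDB88320), the CRC used by ZIP. }
function Crc32Of(const S: RawByteString): LongWord;
var
  I, Bit: Integer;
begin
  Result := $FFFFFFFF;
  for I := 1 to Length(S) do
  begin
    Result := Result xor Ord(S[I]);
    for Bit := 1 to 8 do
      if (Result and 1) <> 0 then
        Result := (Result shr 1) xor $EDB88320
      else
        Result := Result shr 1;
  end;
  Result := not Result;
end;

{ Builds one extra-field record: header ID, data size, version byte,
  CRC-32 of the legacy (OEM/ANSI) name, and finally the UTF-8 name. }
function BuildUnicodePathField(const LegacyName,
  Utf8Name: RawByteString): TBytes;
var
  Crc: LongWord;
  DataSize, I: Integer;
begin
  Crc := Crc32Of(LegacyName);
  DataSize := 1 + 4 + Length(Utf8Name);
  SetLength(Result, 4 + DataSize);
  Result[0] := $75;                       // header ID $7075, little-endian
  Result[1] := $70;
  Result[2] := DataSize and $FF;          // data size, little-endian
  Result[3] := (DataSize shr 8) and $FF;
  Result[4] := 1;                         // version
  for I := 0 to 3 do
    Result[5 + I] := (Crc shr (8 * I)) and $FF;
  if Length(Utf8Name) > 0 then
    Move(Utf8Name[1], Result[9], Length(Utf8Name));
end;

var
  Field: TBytes;
  I: Integer;
begin
  // 'atenção.txt' as ANSI bytes ($E7 $E3) and as UTF-8 ($C3 $A7 $C3 $A3)
  Field := BuildUnicodePathField('aten'#$E7#$E3'o.txt',
                                 'aten'#$C3#$A7#$C3#$A3'o.txt');
  for I := 0 to High(Field) do
    Write(IntToHex(Field[I], 2), ' ');
  WriteLn;
end.
```

The CRC covers the legacy name so that a reader can notice when some other
tool has renamed the entry without updating the UTF-8 copy, in which case
the extra field should be ignored.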

The relevant code is in AbZipTyp.pas in TAbZipItem.SetFilename and
TAbZipItem.LoadFromStream if you want a reference.  It's under the MPL,
but I'm the original author and I'm happy to relicense it if someone
else wants to incorporate the code into paszlib.

https://sourceforge.net/p/tpabbrevia/code/HEAD/tree/trunk/source/AbZipTyp.pas

-- 
Craig Peterson
Scooter Software



Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread silvioprog
2014-05-23 15:50 GMT-03:00 Craig Peterson cr...@scootersoftware.com:

>> Nice. I can do that by opening a new issue in the bug tracker.
>
> Filename encoding in zip files is poorly defined.  The current
> APPNOTE.txt says that the only valid encoding is OEM 437, with UTF-8 if
> a bit is set in the header, but those were recent additions, and in
> practice Windows applications will generally use either the OEM or ANSI
> codepage of the current system locale, and files generated on Unix will
> be UTF-8 but won't have the language encoding bit set.
>
> Abbrevia's zip encoding/decoding tries to handle the issue in as
> compatible a manner as possible.  It stores the original filenames as
> OEM/ANSI based on the current system, and stores a UTF-8 copy in an
> extended header so there's a known way to decode it when changing
> locales.  When reading it has to use lookup tables to guess if the
> filenames are likely OEM or ANSI.  On Unicode-enabled Delphi releases
> it's fully Unicode enabled; on FreePascal and older Delphi releases it
> only supports ANSI filenames but still does proper encoding/decoding.
>
> The relevant code is in AbZipTyp.pas in TAbZipItem.SetFilename and
> TAbZipItem.LoadFromStream if you want a reference.  It's under the MPL,
> but I'm the original author and I'm happy to relicense it if someone
> else wants to incorporate the code into paszlib.
>
> https://sourceforge.net/p/tpabbrevia/code/HEAD/tree/trunk/source/AbZipTyp.pas
>
> --
> Craig Peterson
> Scooter Software


Very nice.

I have a question: if this extended header is added, can I still
open/uncompress the zip file normally in programs like 7z and WinRAR?

-- 
Silvio Clécio
My public projects - github.com/silvioprog

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Craig Peterson
On May 23, 2014, at 8:26 PM, silvioprog silviop...@gmail.com wrote:

> 2014-05-23 15:50 GMT-03:00 Craig Peterson cr...@scootersoftware.com
> I have a question: if this extended header is added, can I still
> open/uncompress the zip file normally in programs like 7z and WinRAR?

Yes.  The appnote describes the format for the extra field, which is 
extensible so multiple records can be stored and applications can skip over any 
they don't understand. 7-zip doesn't use it, but doesn't have a problem with 
it. I'm not sure about WinRAR, but WinZip will use the header if it exists. I 
didn't invent it for Abbrevia; it was originally designed by the Info-zip guys. 

-- 
Craig Peterson
Scooter Software

Re: [fpc-pascal] TZipper and special file names like atenção.txt (#26213)

2014-05-23 Thread Craig Peterson
The Info-zip project maintains an annotated Appnote that lists a bunch of the 
extra fields that various vendors use here:

http://www.info-zip.org/doc/

-- 
Craig Peterson
Scooter Software