Re: [compress] New draft 5

2006-05-03 Thread C. Grobmeier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Henri Yandell wrote:
 If you could fill out the software grant:
 
 http://www.apache.org/licenses/#grants
 
 and either fax or postal-mail it (see instructions in first paragraph
 of http://www.apache.org/licenses/icla.txt;).

[x] done via fax
Please come back to me if the fax doesn't arrive

chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEWNvekv8rKBUE/T4RAnqGAJ9YQKz7KEebe2uQGvEkCgSVlxiSMwCcCSKD
y8jvogkVakzI1PtEqOXENaw=
=w/Xo
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread C. Grobmeier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Henri Yandell wrote:
 If you could fill out the software grant:
 
 http://www.apache.org/licenses/#grants
 
 and either fax or postal-mail it (see instructions in first paragraph
 of http://www.apache.org/licenses/icla.txt;).
 

I will fill out the form and send it today.

 It's better to get the code in early and then have comments be patches
 than continually keep sending you back to the drawing board I think.

I agree. Having a high level look at the discussions followed by my
proposal, i will prepare a new draft. So please hold on before comitting :-)

Regards,
Chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVydgkv8rKBUE/T4RAqWzAKCReSn/6vPAJk4W+wchxSAVTjLH7ACfQCoz
BJKgL+JWWWI0eitr/VtB8yQ=
=SbNP
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread C. Grobmeier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Sandy McArthur wrote:
 On 4/30/06, C. Grobmeier [EMAIL PROTECTED] wrote:
 here is a new draft for the compress interface:

 * http://grobmeier.de/commons-compress-draft-5.zip
 
 Same as before, suggestions as a stream of consciousness, take the
 ones you like:
 

Thanks again to all for your comments. I will think about em, create a
new draft and come back here in a few days.

Regards,
Chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVyfkkv8rKBUE/T4RAsarAJ0dWH03RbU8I1C8ib6Jl5V9RKLovwCeM2VM
L5n+xE2rW5fXJkwmK+Y9Veo=
=M4TW
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread C. Grobmeier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

 My fault for only just joining in etc, but why have the underlying
 Tar/Zip classes become package scoped? Does the bridging API replace
 all need to dig into the lower-level APIs?
 

I was not thinking enough about this. In my fault i was thinking its a
good idea to lower visibility. But i didn't think about if the bridge
has enough functionallity or not.

 If so, why are the bzip classes still public?

Cause bzip still needs refactoring. There were compile errors on
severeal points, and i didn't want to dig all up.

 Should there be ZipCompressor and GZipCompressor implementations?

In my opinion, yes.

 Any reason for there not being an archiveToStream method? Especially
 in the case of the Tar implementation where you'd want to attach a
 TarArchiver to a GZipCompressor.

I don't know any reason not to do so. I will remind this in Draft 6.
Thanks,
Chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVymhkv8rKBUE/T4RAk0yAJ9Jx6+pfSRVvsuApFCbZpkKiO7AwQCeJPLx
g3rJSBX+ovIEPZrW7JKJnLY=
=YsaR
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread Torsten Curdt

On 5/1/06, Sandy McArthur [EMAIL PROTECTED] wrote:

On 4/30/06, will pugh [EMAIL PROTECTED] wrote:
 1)  You often change method names based on the parameter types, e.g.
 Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
 setUnpackDestinationFile, etc.  It seems more conventional  and less
 chaotic to give all the methods the same name, and have them only differ
 based on parameter.  Examples of this style are constructors for
 java.utils.zip.ZipFile, java.io.FileInputStream,
 java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
 org.apache.commons.io.FileUtils.isFileNewer, etc

This is intentional, in a previous version the method names were the
same. The problem with using the same name but different param types
breaks the JavaBean property getter/setter rules and the classes will
not be as usable in at least some scripting environments.


I am sorry ...whoever wants to use this API from a JavaBean
can easily write a little wrapper class. I rather have a slick API
design than adhering to the JavaBean interface.

My 2 cents

cheers
--
Torsten

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread Torsten Curdt

Sorry, joining the party so late ...but this thread let me to actually
have a look ;)

So my comments:

o Don't like that compressor does both compression and decompression.
o Always use File not String
o Please parameter overloading ...this method naming scheme makes
  the API really ugly

But to be a bit more constructive... I would think more of something like:

Compressor compressor = CompressorFactory.newInstance(new File(file.bz2));
Compressor compressor = CompressorFactory.newInstance(bzip2);
Compressor compressor = new Bzip2Compressor();
Compressor compressor = Bzip2Compressor.getInstance();
File output = compressor.compress(new File(input));
File output = compressor.compress(new FileInputStream(input));
InputStream in = compressor.compressStream(new File(input))
InputStream in = compressor.compressStream(new FileInputStream(input))
compressor.compressTo(new File(input), new File(output));
compressor.compressTo(new FileInputStream(input), new
FileOutputStream(output));

Decompressor decompressor = DecompressorFactory.newInstance(bzip2);
File output = decompressor.decompress(new File(input))
File output = decompressor.decompress(new FileInputStream(input));
InputStream in = decompressor.decompressStream(new File(input))
InputStream in = decompressor.decompressStream(new FileInputStream(input))
decompressor.decompressTo(new File(input), new File(output));
decompressor.decompressTo(new FileInputStream(input), new
FileOutputStream(output));

Archive arch = new TarArchive(new File(my.tar));
Archive arch = ArchiveFactory.newInstance(new File(my.tar);
arch.add(new File(input));
arch.add(new FileInputStream(input));
arch.delete(path);
arch.save();
arch.save(new File(output));
arch.save(new FileOutputStream(out));
for(Iterator it = arch.iterator(); it.hasNext(); it) {
 ArchiveEntry entry = (ArchiveEntry) it.next();
 ...
}
ArchiveEntry entry = arch.getEntry(path);
entry.extract(new File(output));
entry.extract(new FileOutputStream(output));
entry.delete();

cheers
--
Torsten

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread Henri Yandell

On 5/2/06, Torsten Curdt [EMAIL PROTECTED] wrote:

Sorry, joining the party so late ...but this thread let me to actually
have a look ;)

So my comments:

o Don't like that compressor does both compression and decompression.


I'm ambivalent on this one.


o Always use File not String


+1. Only use 'String filename' when you're doing something to a
filename - otherwise we bloat the APIs for the sake of a new
File(xxx). It's not worth it.

Also, the setXxx stuff seems a bit unnecessary.

For the interface, minimal and stateless seem like important goals.
Having setXxx just means that the implementors have to worry about
thread safety.

Hen

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-02 Thread C. Grobmeier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

 So my comments:

 o Don't like that compressor does both compression and decompression.
 
 I'm ambivalent on this one.
 

If we would do so, we have 4 different interfaces. But i like the look.

 o Always use File not String
 
 +1. Only use 'String filename' when you're doing something to a
 filename - otherwise we bloat the APIs for the sake of a new
 File(xxx). It's not worth it.

Agreed, i will delete that.

 Also, the setXxx stuff seems a bit unnecessary.
 
 For the interface, minimal and stateless seem like important goals.
 Having setXxx just means that the implementors have to worry about
 thread safety.

OK, my new draft will remind that. I will keep it pure ;-)

Chris
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEV50/kv8rKBUE/T4RAiYiAKCJEGc8eQ07MgenqF/BNBjr4rQ8UQCfRIhL
8K3+oBrOYvXuAGIyUuR/yQM=
=UBTS
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-01 Thread will pugh


Sandy McArthur wrote:


On 4/30/06, will pugh [EMAIL PROTECTED] wrote:


1)  You often change method names based on the parameter types, e.g.
Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
setUnpackDestinationFile, etc.  It seems more conventional  and less
chaotic to give all the methods the same name, and have them only differ
based on parameter.  Examples of this style are constructors for
java.utils.zip.ZipFile, java.io.FileInputStream,
java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
org.apache.commons.io.FileUtils.isFileNewer, etc



This is intentional, in a previous version the method names were the
same. The problem with using the same name but different param types
breaks the JavaBean property getter/setter rules and the classes will
not be as usable in at least some scripting environments.


Hmm.  What scripting envirionments don't allow you to call this kind of 
Java method?  I looked at the documentation for Jythong, JRuby and 
Groovy and they all seemed to be O.K. with dealing with methods that 
differ by parameter types.


Also, it doesn't seem like the current interface adheres to the Java 
Bean specification anyways (unless I'm missing somethign).   If you 
wanted to adhere to the Java Beans Spec, I would suggest having only one 
getter and one setter for each property, i.e. not having a 
setUnpackDestinationFile and setUnpackDestinationFileName, but only 
having setUnpackDestination(File f).


As I see it, here is the list of areas I think the archiver interface 
offers odd behaviour relative to  the JavaBean spec:


 1)  The sourceFile property is recieved by getSourceFile(), but is set 
by loadFile and loadFilebyPath.  Following the JavaBeans rules, this 
means there is a readonly property called sourceFile, rather than this 
being a property that is gettable and settable.


   In addition to load* are bad names because they don't actually load 
anything.


 2)  The unpackDestination property has a getter, but no setter.
  The unpackDestinationFile property has a setter but no getter.
  The unpackDestinationName property has a setter but no getter



4)  None of your interfaces deal with the case of what to do if the
destination file is already there.  Choices are either defining it in
the interface, or adding a property defining what to do.  I would
suggest the latter, and would suggest that for Archiver this method
should take a FileFilter (since the unpack behaviour can be non-trivial)
and should default to FalseFileFilter.INSTANCE.  For the Compresser
interface a simple boolean is probably sufficient.  e.g.

Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
Compresser.setOverwrite(true);



I'll disagree that this should be configurable. IMO it should just do
whatever FileOutputStream or whatever is used does. It's the calling
code's responsibility to handle any name collisions.

How can the calling code handle name collisions?  It takes significantly 
more work for it to walk through all the entries in the archive and then 
walking through all the files in the file ssytem.  Are you suggesting a 
caller should only unpack into an empty directory? 

I'm not really sure I understand what your saying about 
FileOutputStream.  The apis for the Archiver don't allow you to set an 
OutputStream, and it's not really applicable to upack (which is writing 
many files). 

FileOutputStream allows you to set a boolean for whether the stream 
should overwrite or append.  This is made more difficult in the unpack 
scenario where there are more than one file.  From personal experience, 
I liked the solution used in FileUtils.iterateFiles and 
FileUtils.listFiles for doing yeah/nay on a recursive operation like this.



   Cheers,
   --Will


--
Sandy McArthur

He who dares not offend cannot be honest.
- Thomas Paine

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-01 Thread Henri Yandell

On 4/30/06, C. Grobmeier [EMAIL PROTECTED] wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello all,

here is a new draft for the compress interface:

* http://grobmeier.de/commons-compress-draft-5.zip


My fault for only just joining in etc, but why have the underlying
Tar/Zip classes become package scoped? Does the bridging API replace
all need to dig into the lower-level APIs?

If so, why are the bzip classes still public?

Should there be ZipCompressor and GZipCompressor implementations?

Any reason for there not being an archiveToStream method? Especially
in the case of the Tar implementation where you'd want to attach a
TarArchiver to a GZipCompressor.

Hen

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-01 Thread Sandy McArthur

On 5/1/06, will pugh [EMAIL PROTECTED] wrote:


Sandy McArthur wrote:

 On 4/30/06, will pugh [EMAIL PROTECTED] wrote:

 1)  You often change method names based on the parameter types, e.g.
 Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
 setUnpackDestinationFile, etc.  It seems more conventional  and less
 chaotic to give all the methods the same name, and have them only differ
 based on parameter.  Examples of this style are constructors for
 java.utils.zip.ZipFile, java.io.FileInputStream,
 java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
 org.apache.commons.io.FileUtils.isFileNewer, etc


 This is intentional, in a previous version the method names were the
 same. The problem with using the same name but different param types
 breaks the JavaBean property getter/setter rules and the classes will
 not be as usable in at least some scripting environments.

Hmm.  What scripting envirionments don't allow you to call this kind of
Java method?  I looked at the documentation for Jythong, JRuby and
Groovy and they all seemed to be O.K. with dealing with methods that
differ by parameter types.

Also, it doesn't seem like the current interface adheres to the Java
Bean specification anyways (unless I'm missing somethign).   If you
wanted to adhere to the Java Beans Spec, I would suggest having only one
getter and one setter for each property, i.e. not having a
setUnpackDestinationFile and setUnpackDestinationFileName, but only
having setUnpackDestination(File f).


I'll agree it's not strict adherence to the JavaBean spec but
following JavaBean conventions even partially can be useful. A few
years ago I was treating some JavaMail objects like beans and between
Java 1.3 and 1.4 the rules on how Java determined what was a legal
javabean property got more strict. This caused problems when I did
something similar to message.bodyPart.subject in a JSP el
expression. and there was one getBodyPart() and two setBodyPart(Type1)
and a setBodyPart(Type2).


As I see it, here is the list of areas I think the archiver interface
offers odd behaviour relative to  the JavaBean spec:

  1)  The sourceFile property is recieved by getSourceFile(), but is set
by loadFile and loadFilebyPath.  Following the JavaBeans rules, this
means there is a readonly property called sourceFile, rather than this
being a property that is gettable and settable.

In addition to load* are bad names because they don't actually load
anything.

  2)  The unpackDestination property has a getter, but no setter.
   The unpackDestinationFile property has a setter but no getter.
   The unpackDestinationName property has a setter but no getter


Okay, I'll agree that unpackDestinationFile getter should be renamed
to unpackDestination but the unpackDestinationName property should
still keep a separate name and it's okay if it's a dynamic property
that really sets the unpackDestination property.


 4)  None of your interfaces deal with the case of what to do if the
 destination file is already there.  Choices are either defining it in
 the interface, or adding a property defining what to do.  I would
 suggest the latter, and would suggest that for Archiver this method
 should take a FileFilter (since the unpack behaviour can be non-trivial)
 and should default to FalseFileFilter.INSTANCE.  For the Compresser
 interface a simple boolean is probably sufficient.  e.g.

 Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
 Compresser.setOverwrite(true);


 I'll disagree that this should be configurable. IMO it should just do
 whatever FileOutputStream or whatever is used does. It's the calling
 code's responsibility to handle any name collisions.

How can the calling code handle name collisions?  It takes significantly
more work for it to walk through all the entries in the archive and then
walking through all the files in the file ssytem.  Are you suggesting a
caller should only unpack into an empty directory?

I'm not really sure I understand what your saying about
FileOutputStream.  The apis for the Archiver don't allow you to set an
OutputStream, and it's not really applicable to upack (which is writing
many files).


I was confused about what you were referring to. I was thinking you
wanted to specify the behavior for what would happen for creating a
new archive on to an existing name.

I'll agree the behavior for unpacking into an existing directory tree
should be improved.

--
Sandy McArthur

He who dares not offend cannot be honest.
- Thomas Paine

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-05-01 Thread will pugh
I may be missing some requirements here, but it seems like in light of 
the JavaBean bits we were talking about, that it makes sense to restrict 
the properties on the actual interface, and have them correspond 
directly to JavaBean properties.  In addition, I would propose getting 
rid of some properties and moving them onto the method calls.  Is there 
a reason not to?


For an interface that basically does two things, it is suprising that it 
has 16 methods on it.  Of those, 9 appear to be getter/setters on 
properties.  Those 9 getters/setters refer to 3 properties in 
AbstractArchiver.  Of those 3 properties, only 1 is used by more than 
one method (that is not a getter/setter).  Why do we have 
destinationFile and unpackDestination properties on the object, if they 
are only used in one place?  Why not just make these parameters? If you 
did this, you would have an interface that offers the exact same 
functionality, but has only 6 methods.  Seems like there is a bit of 
extra entropy.


In addition, I sorta like the  way java.io.File, java.util.zip.ZipFile, 
etc let you set the source file in the constructor.  It makes the code 
using the objects less verbose, with no loss of readability.  I would 
suggest that we add this to the archivers, by adding constructors that 
can take a string or a file for the source file.  In addition, we can 
bubble those changes up to the ArchiverType class.


What is the purpose of the ArchiverType class?  I'm assuming the idea is 
that this helps folks build factories, but that it is perfectly 
acceptable and encouraged to create any of the archivers via 'new'.  Is 
that correct?


  Thanks,
  --Will


Sandy McArthur wrote:


On 5/1/06, will pugh [EMAIL PROTECTED] wrote:



Sandy McArthur wrote:

 On 4/30/06, will pugh [EMAIL PROTECTED] wrote:

 1)  You often change method names based on the parameter types, e.g.
 Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
 setUnpackDestinationFile, etc.  It seems more conventional  and less
 chaotic to give all the methods the same name, and have them only 
differ

 based on parameter.  Examples of this style are constructors for
 java.utils.zip.ZipFile, java.io.FileInputStream,
 java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
 org.apache.commons.io.FileUtils.isFileNewer, etc


 This is intentional, in a previous version the method names were the
 same. The problem with using the same name but different param types
 breaks the JavaBean property getter/setter rules and the classes will
 not be as usable in at least some scripting environments.

Hmm.  What scripting envirionments don't allow you to call this kind of
Java method?  I looked at the documentation for Jythong, JRuby and
Groovy and they all seemed to be O.K. with dealing with methods that
differ by parameter types.

Also, it doesn't seem like the current interface adheres to the Java
Bean specification anyways (unless I'm missing somethign).   If you
wanted to adhere to the Java Beans Spec, I would suggest having only one
getter and one setter for each property, i.e. not having a
setUnpackDestinationFile and setUnpackDestinationFileName, but only
having setUnpackDestination(File f).



I'll agree it's not strict adherence to the JavaBean spec but
following JavaBean conventions even partially can be useful. A few
years ago I was treating some JavaMail objects like beans and between
Java 1.3 and 1.4 the rules on how Java determined what was a legal
javabean property got more strict. This caused problems when I did
something similar to message.bodyPart.subject in a JSP el
expression. and there was one getBodyPart() and two setBodyPart(Type1)
and a setBodyPart(Type2).


As I see it, here is the list of areas I think the archiver interface
offers odd behaviour relative to  the JavaBean spec:

  1)  The sourceFile property is recieved by getSourceFile(), but is set
by loadFile and loadFilebyPath.  Following the JavaBeans rules, this
means there is a readonly property called sourceFile, rather than this
being a property that is gettable and settable.

In addition to load* are bad names because they don't actually load
anything.

  2)  The unpackDestination property has a getter, but no setter.
   The unpackDestinationFile property has a setter but no getter.
   The unpackDestinationName property has a setter but no getter



Okay, I'll agree that unpackDestinationFile getter should be renamed
to unpackDestination but the unpackDestinationName property should
still keep a separate name and it's okay if it's a dynamic property
that really sets the unpackDestination property.


 4)  None of your interfaces deal with the case of what to do if the
 destination file is already there.  Choices are either defining it in
 the interface, or adding a property defining what to do.  I would
 suggest the latter, and would suggest that for Archiver this method
 should take a FileFilter (since the unpack behaviour can be 

Re: [compress] New draft 5

2006-05-01 Thread will pugh

I'm not sure if there is some requirement I'm missing, but

1)  It still seems strange to me that we have an interface that does two 
things:  pack an archive + unpack an archive, yet it has 16 methods on it.
   Of these, 9 seem to be some variation of property getters/setters.  
As far as I can tell, none of these properties are used by both the pack 
and unpack process.


Methods that set/get properties only used


1)  Why do we need getters and setters that take both a File and a 
String as parameters?  Why can't we standardize on just having a File 
type for sourceFile and unpackDestination.


2)  I'm still not sure why unpackDestination is a property and not just 
a parameter on unpack?  It seems that it is a property that is used in 
one and only one place.  It seems like logically its not a property of 
the Archiver as much as a property of the actual packing process.


3)  Why don't we have a constructor for each of the archivers that can 
take a String or a File as the SourceFile (similar to how java.io.File, 
java.util.zip.ZipFile, etc deal with this).  If you didn't want to 
enforce this as a convention for Archivers, it could easily be added 
into the ArchiverType.getInstance() method.  We could have ArchiverType 
provide two getInstance methods, one could take a string and one could 
take a file.  This way the interface can be minimal, and we can hoist 
type conversions off to helpers rather than the interface


4)  What is the purpose of the ArchiverType?  I'm assuming this is meant 
to help people build their own factories?  but the common case for 
creating an Archiver is still to create using 'new'.  Is this right?



   --Will

Sandy McArthur wrote:


On 5/1/06, will pugh [EMAIL PROTECTED] wrote:



Sandy McArthur wrote:

 On 4/30/06, will pugh [EMAIL PROTECTED] wrote:

 1)  You often change method names based on the parameter types, e.g.
 Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
 setUnpackDestinationFile, etc.  It seems more conventional  and less
 chaotic to give all the methods the same name, and have them only 
differ

 based on parameter.  Examples of this style are constructors for
 java.utils.zip.ZipFile, java.io.FileInputStream,
 java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
 org.apache.commons.io.FileUtils.isFileNewer, etc


 This is intentional, in a previous version the method names were the
 same. The problem with using the same name but different param types
 breaks the JavaBean property getter/setter rules and the classes will
 not be as usable in at least some scripting environments.

Hmm.  What scripting envirionments don't allow you to call this kind of
Java method?  I looked at the documentation for Jythong, JRuby and
Groovy and they all seemed to be O.K. with dealing with methods that
differ by parameter types.

Also, it doesn't seem like the current interface adheres to the Java
Bean specification anyways (unless I'm missing somethign).   If you
wanted to adhere to the Java Beans Spec, I would suggest having only one
getter and one setter for each property, i.e. not having a
setUnpackDestinationFile and setUnpackDestinationFileName, but only
having setUnpackDestination(File f).



I'll agree it's not strict adherence to the JavaBean spec but
following JavaBean conventions even partially can be useful. A few
years ago I was treating some JavaMail objects like beans and between
Java 1.3 and 1.4 the rules on how Java determined what was a legal
javabean property got more strict. This caused problems when I did
something similar to message.bodyPart.subject in a JSP el
expression. and there was one getBodyPart() and two setBodyPart(Type1)
and a setBodyPart(Type2).


As I see it, here is the list of areas I think the archiver interface
offers odd behaviour relative to  the JavaBean spec:

  1)  The sourceFile property is recieved by getSourceFile(), but is set
by loadFile and loadFilebyPath.  Following the JavaBeans rules, this
means there is a readonly property called sourceFile, rather than this
being a property that is gettable and settable.

In addition to load* are bad names because they don't actually load
anything.

  2)  The unpackDestination property has a getter, but no setter.
   The unpackDestinationFile property has a setter but no getter.
   The unpackDestinationName property has a setter but no getter



Okay, I'll agree that unpackDestinationFile getter should be renamed
to unpackDestination but the unpackDestinationName property should
still keep a separate name and it's okay if it's a dynamic property
that really sets the unpackDestination property.


 4)  None of your interfaces deal with the case of what to do if the
 destination file is already there.  Choices are either defining it in
 the interface, or adding a property defining what to do.  I would
 suggest the latter, and would suggest that for Archiver this method
 should take a FileFilter (since the unpack behaviour can be 

Re: [compress] New draft 5

2006-04-30 Thread Henri Yandell

Sounds good, I'll look at the code tonight (hopefully) and comment; in
the meantime, let's go ahead and get a software grant moving for the
code.

If you could fill out the software grant:

http://www.apache.org/licenses/#grants

and either fax or postal-mail it (see instructions in first paragraph
of http://www.apache.org/licenses/icla.txt;).

It's better to get the code in early and then have comments be patches
than continually keep sending you back to the drawing board I think.

Hen

On 4/30/06, C. Grobmeier [EMAIL PROTECTED] wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello all,

here is a new draft for the compress interface:

* http://grobmeier.de/commons-compress-draft-5.zip

I have improved a lot of things, based on the comments of Sandy (thanks
for that).

This draft is not perfect, as you can see in the todo list. But imho we
have a quite good base for future work, everything compiles and works
and is documented. So i would like to propose that someone is comitting
this, except you have reasons not to do this.
If you agree with me, i will open a bugzilla issue and add the link to
the code.

- - Chris.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
bT2VrGdR8Byt+VjsRo7Cyhw=
=1GRF
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-04-30 Thread Sandy McArthur

On 4/30/06, C. Grobmeier [EMAIL PROTECTED] wrote:

here is a new draft for the compress interface:

* http://grobmeier.de/commons-compress-draft-5.zip


Same as before, suggestions as a stream of consciousness, take the
ones you like:

package org.apache.commons.compress:

In Archiver:
Why do pack() and packToFilename(String) both return a boolean and
throws a PackException? The Javadocs state @return true, if the
operation has been ended without exceptions. IMO the return type
isn't meaningful as if there was an exception that would be thrown,
and a return of false would never actually happen. When is it
meaningful to swallow an exception and still return false?

Does the Archiver ever unpack to a file or only a directory? In Java a
directory is represented as a File but I think it would be nice if the
Javadocs made that clear for getUnpackDestination(),
setUnpackDestinationName(String), and setUnpackDestinationFile(File).
And maybe those setters should throw an exception when the parameter
isn't a directory.

I don't really like that the Archiver interface combines both both
pack and unpack and there are difference sets of  properties for each
operation. In my mind you have the concept of an Archive and then you
ask for another implementation of Exploder and Packer interfaces.
Each one only does one thing and the Archiver can choose to make both
types available, only make one type available, support both types but
throw an exception when the second type is requested to enforce one
activity at a time.

In Compressor:
I don't see a need to expose that a FileInputStream is being used
internally. Someday you may want to implement an optimization that
decompresses small files totally in memory. Change:
FileInputStream streamCompressedInputStream(FileInputStream input),
FileInputStream streamCompressedFilename(String pathToFile), and
FileInputStream streamCompressedFile(File input) to return an
InputStream instead.

I also don't like that Compressor mixes compress and decompress
specific behaviors in one interface but its not so bad because there
are not properties that go with a behavior. All relevant information
is included in the method call parameters which is good.

In ArchiverType:
public static ArchiverType ZIP = ...; and public static ArchiverType
TAR = ; should both be final.

In AbstractCompressor:
The copy( final InputStream input, final OutputStream output ) method,
while useful, doesn't really belong in AbstractCompressor. It's just a
utility method that an implementation may need. IMO it doesn't warrant
being part of the exposed API. It's nice to aid in code reuse but I
don't think it should happen as the expense of sticking to one
purpose.

Since Java 1.2 the File class has had createTempFile methods which
makes the ones in AbstractCompressor unneeded and, while unlikely, it
is possible that createTempFileName() could return the same name in
two successive calls which would be bad.

In CompressorType:
public static CompressorType BZIP2 = ... ; also, final.

For both of ArchiveType and CompressorType add a valueOf(String)
method. This is what Java 1.5 Enums have and it lets you convert a
String, say from a config file, into an Enum. To do this right you'll
need to keep track of other Types with a Map or something in the
constructor.

In CompressException, DecompressException, PackException, and UnpackException:
The constructors that takes (String, Exception) call
this.setStackTrace( e.getStackTrace() ); If Java 1.4 is a minimum
requirement than loose that as it's misleading and use the initCause
method to do exception chaining.

packages org.apache.commons.compress.archivers.tar, and
org.apache.commons.compress.archivers.zip:

These packages have a minimally exposed API and I like that. This will
help make future improvements easier to implement.

package org.apache.commons.compress.compressors.bzip2:

Same comments as before about the Bzip2{In,Out}Streams which you have
in the TODO.txt file.


This draft is not perfect, as you can see in the todo list. But imho we
have a quite good base for future work, everything compiles and works
and is documented. So i would like to propose that someone is comitting
this, except you have reasons not to do this.
If you agree with me, i will open a bugzilla issue and add the link to
the code.


I feel like the public interface was noticeably more usable and has
less noise. IMO it's not yet a great API but it's a very decent one.

--
Sandy McArthur

He who dares not offend cannot be honest.
- Thomas Paine

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-04-30 Thread will pugh

Overall, I like the interfaces, but I've got a few issues:

0)  Update is mentioned in the Javadoc at the beginning of Archiver, but 
is not implemented.


1)  You often change method names based on the parameter types, e.g. 
Archiver.addFile + Archiver.addFileName, setUnpackDestinationName + 
setUnpackDestinationFile, etc.  It seems more conventional  and less 
chaotic to give all the methods the same name, and have them only differ 
based on parameter.  Examples of this style are constructors for 
java.utils.zip.ZipFile, java.io.FileInputStream, 
java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy, 
org.apache.commons.io.FileUtils.isFileNewer, etc


2)  You can only add files to an archive, this makes it more difficult 
to add generated entries into an archive.  You need to actually write 
them to disk before including them in an archive


3)  UnpackDestination + Destination should be the same property (on the 
both the interface as well as underlying implementation), or you should 
split packing and unpacking into two different interfaces (or force 
folks to pass a path to pack + unpack, instead of setting properties on 
the object).


4)  None of your interfaces deal with the case of what to do if the 
destination file is already there.  Choices are either defining it in 
the interface, or adding a property defining what to do.  I would 
suggest the latter, and would suggest that for Archiver this method 
should take a FileFilter (since the unpack behaviour can be non-trivial) 
and should default to FalseFileFilter.INSTANCE.  For the Compresser 
interface a simple boolean is probably sufficient.  e.g.


   Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
   Compresser.setOverwrite(true);

5)  In AbstractCompressor, I don't understand why you have a copy method 
and don't just use IOUtils.copy()
6)  In AbstractCompressor, I don't understand why you try to create your 
own temp name, rather than just letting File.createTempFile do it's thing.


   --Will

C. Grobmeier wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello all,

here is a new draft for the compress interface:

* http://grobmeier.de/commons-compress-draft-5.zip

I have improved a lot of things, based on the comments of Sandy (thanks
for that).

This draft is not perfect, as you can see in the todo list. But imho we
have a quite good base for future work, everything compiles and works
and is documented. So i would like to propose that someone is comitting
this, except you have reasons not to do this.
If you agree with me, i will open a bugzilla issue and add the link to
the code.

- - Chris.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
bT2VrGdR8Byt+VjsRo7Cyhw=
=1GRF
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

 



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [compress] New draft 5

2006-04-30 Thread will pugh
It occured to me later on, that I think this interface would be better 
if it were focused on describing an Archive, rather than trying to 
describe two processes (ie describing the noun rather than all the 
verbs).  The fall out from this, is that you don't have a member for an 
unpack directory, since that is only relevent to the unpacking process.  
and I think it makes the interface I went to mock this up and came up 
with a couple other issues with the interface.


 1)  Adding an entry or a file to an archive, you need to be able to 
set the entry name, otherwise you can run into problems with getting 
just the right relative path in the archive for the files.
 2)  It is pretty common to want to recursively add directories, we 
should provide this in the archive API


I've attached my mock-up so folks can see if they think these thoughts 
make sense.  I tried to borrow design and parameter conventions from 
some of the commons-io methods I use alot.


   --Will


will pugh wrote:


Overall, I like the interfaces, but I've got a few issues:

0)  Update is mentioned in the Javadoc at the beginning of Archiver, 
but is not implemented.


1)  You often change method names based on the parameter types, e.g. 
Archiver.addFile + Archiver.addFileName, setUnpackDestinationName + 
setUnpackDestinationFile, etc.  It seems more conventional  and less 
chaotic to give all the methods the same name, and have them only 
differ based on parameter.  Examples of this style are constructors 
for java.utils.zip.ZipFile, java.io.FileInputStream, 
java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy, 
org.apache.commons.io.FileUtils.isFileNewer, etc


2)  You can only add files to an archive, this makes it more difficult 
to add generated entries into an archive.  You need to actually write 
them to disk before including them in an archive


3)  UnpackDestination + Destination should be the same property (on 
the both the interface as well as underlying implementation), or you 
should split packing and unpacking into two different interfaces (or 
force folks to pass a path to pack + unpack, instead of setting 
properties on the object).


4)  None of your interfaces deal with the case of what to do if the 
destination file is already there.  Choices are either defining it in 
the interface, or adding a property defining what to do.  I would 
suggest the latter, and would suggest that for Archiver this method 
should take a FileFilter (since the unpack behaviour can be 
non-trivial) and should default to FalseFileFilter.INSTANCE.  For the 
Compresser interface a simple boolean is probably sufficient.  e.g.


   Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
   Compresser.setOverwrite(true);

5)  In AbstractCompressor, I don't understand why you have a copy 
method and don't just use IOUtils.copy()
6)  In AbstractCompressor, I don't understand why you try to create 
your own temp name, rather than just letting File.createTempFile do 
it's thing.


   --Will

C. Grobmeier wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello all,

here is a new draft for the compress interface:

* http://grobmeier.de/commons-compress-draft-5.zip

I have improved a lot of things, based on the comments of Sandy (thanks
for that).

This draft is not perfect, as you can see in the todo list. But imho we
have a quite good base for future work, everything compiles and works
and is documented. So i would like to propose that someone is comitting
this, except you have reasons not to do this.
If you agree with me, i will open a bugzilla issue and add the link to
the code.

- - Chris.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
bT2VrGdR8Byt+VjsRo7Cyhw=
=1GRF
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

 



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

package org.apache.commons.compress;

import java.io.*;
import java.util.Iterator;

/**
 * Archiver is an interface that defines a generic achive file.  This interface
 * allows a caller to inspect the entries in an archive, create new ones, add 
entries to existing ones
 * or to expand an archive.
 *
 *
 * @author christian.grobmeier
 */
public interface MyArchiver {

/**
 * If pack is called with this as the ioverwrite/i parameter, the method
 * will fail if the ArchiveFile exists.
 */
public static final int FAIL_ON_EXISTANCE = 0;

/**
 * If pack is called with this as the ioverwrite/i parameter, the method
 * will overwrite ArchiveFile.  The new archive will only contain the 
entries
 * added via iaddEntry/i.
 */
public