Re: [geometry] IO Modules - Part 2

2021-04-06 Thread Matt Juntunen
I have not received any responses on this so I am going to go ahead and merge 
the PR.
-Matt

From: Matt Juntunen 
Sent: Sunday, April 4, 2021 7:32 AM
To: Commons Developers List 
Subject: Re: [geometry] IO Modules - Part 2

Hello,

Would anyone like more time to look over this? If not, and assuming there are 
no objections, I plan on merging it in the next few days.

Regards,
Matt J

From: Matt Juntunen 
Sent: Sunday, March 28, 2021 1:39 PM
To: Commons Developers List 
Subject: [geometry] IO Modules - Part 2

Hello,

I've created a new PR with IO modules for commons-geometry [1]. This is a 
refactoring of an earlier PR I had and incorporates changes suggested by 
reviewers. Here are some of the main differences with the previous version:

  1.  Inputs and outputs are abstracted behind GeometryInput and GeometryOutput 
interfaces. The interface provides access to the input/output stream, the file 
name (if available), and a charset (if applicable and available).
  2.  Formats are represented with the GeometryFormat interface. This interface 
defines the format name and standard file extensions associated with the format.
  3.  There is now a GeometryFormat3D enum that implements GeometryFormat and 
provides easy access to the formats supported by the library.
  4.  The IO3D class provides static convenience methods for simple operations. 
If a path or url is passed with no specified format, the file extension is 
examined to determine the format.
  5.  I've added support for STL files (GEOMETRY-101).

Below is an example using the BoundaryManager3D class, which is the class users 
would use if they wanted to customize the default IO functionality.

DoublePrecisionContext precision = new 
EpsilonDoublePrecisionContext(1e-10);

// create some geometry to write
BoundarySource3D cube = Parallelepiped.unitCube(precision);

// create the manager and add handlers
BoundaryIOManager3D manager = new BoundaryIOManager3D();
manager.registerReadHandler(new StlBoundaryReadHandler3D());
manager.registerWriteHandler(new StlBoundaryWriteHandler3D());

Path file = Paths.get("target/cube.stl");

// write
manager.write(cube, new FileGeometryOutput(file), GeometryFormat3D.STL);

// read back
BoundarySource3D result = manager.read(new FileGeometryInput(file), 
GeometryFormat3D.STL, precision);


Here is the same example using the convenience methods in IO3D. The file format 
is inferred from the file name.

DoublePrecisionContext precision = new 
EpsilonDoublePrecisionContext(1e-10);

// create some geometry to write
BoundarySource3D cube = Parallelepiped.unitCube(precision);

Path file = Paths.get("target/cube.stl");

// write
IO3D.write(cube, file);

// read back
BoundarySource3D result = IO3D.read(file, precision);


If anyone has time to review the PR that would be most welcome.

Regards,
Matt J

[1] https://github.com/apache/commons-geometry/pull/141



Re: [geometry] IO Modules - Part 2

2021-04-04 Thread Matt Juntunen
Hello,

Would anyone like more time to look over this? If not, and assuming there are 
no objections, I plan on merging it in the next few days.

Regards,
Matt J

From: Matt Juntunen 
Sent: Sunday, March 28, 2021 1:39 PM
To: Commons Developers List 
Subject: [geometry] IO Modules - Part 2

Hello,

I've created a new PR with IO modules for commons-geometry [1]. This is a 
refactoring of an earlier PR I had and incorporates changes suggested by 
reviewers. Here are some of the main differences with the previous version:

  1.  Inputs and outputs are abstracted behind GeometryInput and GeometryOutput 
interfaces. The interface provides access to the input/output stream, the file 
name (if available), and a charset (if applicable and available).
  2.  Formats are represented with the GeometryFormat interface. This interface 
defines the format name and standard file extensions associated with the format.
  3.  There is now a GeometryFormat3D enum that implements GeometryFormat and 
provides easy access to the formats supported by the library.
  4.  The IO3D class provides static convenience methods for simple operations. 
If a path or url is passed with no specified format, the file extension is 
examined to determine the format.
  5.  I've added support for STL files (GEOMETRY-101).

Below is an example using the BoundaryManager3D class, which is the class users 
would use if they wanted to customize the default IO functionality.

DoublePrecisionContext precision = new 
EpsilonDoublePrecisionContext(1e-10);

// create some geometry to write
BoundarySource3D cube = Parallelepiped.unitCube(precision);

// create the manager and add handlers
BoundaryIOManager3D manager = new BoundaryIOManager3D();
manager.registerReadHandler(new StlBoundaryReadHandler3D());
manager.registerWriteHandler(new StlBoundaryWriteHandler3D());

Path file = Paths.get("target/cube.stl");

// write
manager.write(cube, new FileGeometryOutput(file), GeometryFormat3D.STL);

// read back
BoundarySource3D result = manager.read(new FileGeometryInput(file), 
GeometryFormat3D.STL, precision);


Here is the same example using the convenience methods in IO3D. The file format 
is inferred from the file name.

DoublePrecisionContext precision = new 
EpsilonDoublePrecisionContext(1e-10);

// create some geometry to write
BoundarySource3D cube = Parallelepiped.unitCube(precision);

Path file = Paths.get("target/cube.stl");

// write
IO3D.write(cube, file);

// read back
BoundarySource3D result = IO3D.read(file, precision);


If anyone has time to review the PR that would be most welcome.

Regards,
Matt J

[1] https://github.com/apache/commons-geometry/pull/141



Re: [geometry] IO Modules

2021-01-27 Thread Gilles Sadowski
Hello.

Le lun. 25 janv. 2021 à 13:40, Matt Juntunen
 a écrit :
>
> Hello,
>
> I have two main goals for the IO modules here:
>
>   1.  Provide a simple, high-level API (i.e. IO3D) for reading and writing 
> geometry with a minimum of fuss.

Sure.  But this high-level API looks like "syntactic sugar" that can
certainly be
done in a number of ways (with more or less code and/or more or less
flexibility).

>   2.  Provide a low-level, extensible API specific to each data format that 
> can be used to access addition format-specific information while reading and 
> provide greater control over the output while writing.
>
> So, there are actually two different APIs in question here. Users could use 
> the high-level API when only the geometry itself is of interest and the 
> low-level API when additional metadata is required. Useful examples of this 
> metadata are the object and group names from the OBJ format (which can be 
> used to store separate geometries in a single file) and the facet attribute 
> bytes in binary STL files (which are sometimes used to store color 
> information or other values). This information does not map directly to any 
> data structures in commons-geometry but it is certainly useful to be able to 
> access it (I will want to do so in my day job, for instance).

In effect, how would one map (application) data that is tied to a
(library) facet
instance?

>
> > Such customization could also be handled at the application level through
> a (handler-specific) property file.
>
> I'd rather not deal with configuration files and keep things simple and 
> lightweight.

Maybe I was not clear because I don't see how this configuration file(s)
make it less lightweight from a casual user's POV.  A default configuration
would be provided (that associates default extensions with the library-provided
handlers).  User could easily append extensions and handlers, rather than
having to do it in code.  It also seems like a feature that one could disable a
handler (rather that having code always loaded for a format which the user
doesn't actually want to use).

> > Then the case for the "enum" is moot (IIUC).
>
> Yes, it might be. I would like to allow format names to be mapped to more 
> than one file extension, though.

I don't understand how "enum" and multiple extensions are related...
My remark was about the "enum" usage in order to enforce  a single API
for file format (but I understood that your use-case requires more flexibility).

>
> > User-code should be in charge of associating input (e.g. file name) with 
> > how to handle it (e.g. the instantiation of the read handler).
>
> This would be the case for the low-level API, but I want the high-level API 
> to be able to handle this itself, based on its configuration. I want to be 
> able to call 'IO3D.read(Paths.get("cube.obj"))'

This would just work (the devil being in the details) with the hypothetical
"handlers.conf" file containing:
---CUT---
formats=OBJ
OBJ.extensions=obj,OBJ
OBJ.reader=org.apache.commons.geometry.io.euclidean.threed.ObjFormatReader
---CUT--

Regards,
Gilles

> just as I might call 'ImageIO.read(new File("image.png"))'.
>
> Regards,
> Matt J
>
> 
> From: Gilles Sadowski 
> Sent: Saturday, January 23, 2021 9:40 AM
> To: Commons Developers List 
> Subject: Re: [geometry] IO Modules
>
> Hi.
>
> Le ven. 22 janv. 2021 à 03:38, Matt Juntunen
>  a écrit :
> >
> > Hi Gilles,
> >
> > > Really, the main point is to separate format (contents) from filename 
> > > (container).
> >
> > This makes sense. What would you think of the approach below?
>
> I have no strong objections, as I do not graps all the requirements.
> [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs
> "java.nio").]
>
> > This would separate the format name from the file extension(s) and provide 
> > an enum containing default format information and handlers. Usage of the 
> > enum would be optional since there would still be overloads that accept a 
> > simple format name string.
>
> It reminds me of a discussion concerning "Bloom filters", about identifiers
> for a hash function that could user-defined.
> IIRC, one idea (proposed by Alex) was to maintain a text file of (unique)
> identifiers.
>
> > For the BoundaryIOManager methods that accept a Path or URL, the format 
> > would still be determined by the file extension.
>
> I'm uncomfortable with having that kind of assumption in a low-level library
> (bad reminiscence of M$-DOS days).  User-code should be in charge of

Re: [geometry] IO Modules

2021-01-25 Thread Matt Juntunen
Hello,

I have two main goals for the IO modules here:

  1.  Provide a simple, high-level API (i.e. IO3D) for reading and writing 
geometry with a minimum of fuss.
  2.  Provide a low-level, extensible API specific to each data format that can 
be used to access addition format-specific information while reading and 
provide greater control over the output while writing.

So, there are actually two different APIs in question here. Users could use the 
high-level API when only the geometry itself is of interest and the low-level 
API when additional metadata is required. Useful examples of this metadata are 
the object and group names from the OBJ format (which can be used to store 
separate geometries in a single file) and the facet attribute bytes in binary 
STL files (which are sometimes used to store color information or other 
values). This information does not map directly to any data structures in 
commons-geometry but it is certainly useful to be able to access it (I will 
want to do so in my day job, for instance).

> Such customization could also be handled at the application level through
a (handler-specific) property file.

I'd rather not deal with configuration files and keep things simple and 
lightweight.

> Then the case for the "enum" is moot (IIUC).

Yes, it might be. I would like to allow format names to be mapped to more than 
one file extension, though.

> User-code should be in charge of associating input (e.g. file name) with how 
> to handle it (e.g. the instantiation of the read handler).

This would be the case for the low-level API, but I want the high-level API to 
be able to handle this itself, based on its configuration. I want to be able to 
call 'IO3D.read(Paths.get("cube.obj"))' just as I might call 'ImageIO.read(new 
File("image.png"))'.

Regards,
Matt J


From: Gilles Sadowski 
Sent: Saturday, January 23, 2021 9:40 AM
To: Commons Developers List 
Subject: Re: [geometry] IO Modules

Hi.

Le ven. 22 janv. 2021 à 03:38, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> > Really, the main point is to separate format (contents) from filename 
> > (container).
>
> This makes sense. What would you think of the approach below?

I have no strong objections, as I do not graps all the requirements.
[Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs
"java.nio").]

> This would separate the format name from the file extension(s) and provide an 
> enum containing default format information and handlers. Usage of the enum 
> would be optional since there would still be overloads that accept a simple 
> format name string.

It reminds me of a discussion concerning "Bloom filters", about identifiers
for a hash function that could user-defined.
IIRC, one idea (proposed by Alex) was to maintain a text file of (unique)
identifiers.

> For the BoundaryIOManager methods that accept a Path or URL, the format would 
> still be determined by the file extension.

I'm uncomfortable with having that kind of assumption in a low-level library
(bad reminiscence of M$-DOS days).  User-code should be in charge of
associating input (e.g. file name) with how to handle it (e.g. the instantiation
of the read handler).

> If users want to use a non-standard file extension, they can open the IO 
> stream themselves and use the read/write methods that accept an IO stream and 
> format string name or Format instance.

What is "standard"/"non-standard"?  You use "txt", but the most standard
meaning of this extension is that the contents is ASCII-encoded...
And "csv" is also not sufficient to convery that contents is actually much
more constrained than a comma-separated list of strings.

Couldn't a file be used to define which read/writer the library should
instantiate, and to which extension it could be associated?

>
> interface Format {
> String getName();
> List getFileExtensions();
> }
>
> class BoundaryIOManager {
> void register(BoundaryFormat fmt, BoundaryReadHandler rh, 
> BoundaryWriteHandler wh) {
> register(fmt.getName(), fmt.getFileExtensions(), rh, wh);
> }
> void register(String formatName, List extensions, 
> BoundaryReadHandler rh, BoundaryWriteHandler wh) {...}
>
> // ...
>
> void write(BoundarySource src, OutputStream out, Format fmt) {
> write(src, in, fmt.getName());
> }
> void write(BoundarySource src, OutputStream out, String formatName) 
> {...}
>
> // similar read methods ...
> }
>
> enum StandardFormat3D implements Format {
> OBJ(...),
> TXT(...),
> CSV(...);
>
> public String getName() {...}
>

Re: [geometry] IO Modules

2021-01-23 Thread Gilles Sadowski
Hi.

Le ven. 22 janv. 2021 à 03:38, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> > Really, the main point is to separate format (contents) from filename 
> > (container).
>
> This makes sense. What would you think of the approach below?

I have no strong objections, as I do not graps all the requirements.
[Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs
"java.nio").]

> This would separate the format name from the file extension(s) and provide an 
> enum containing default format information and handlers. Usage of the enum 
> would be optional since there would still be overloads that accept a simple 
> format name string.

It reminds me of a discussion concerning "Bloom filters", about identifiers
for a hash function that could user-defined.
IIRC, one idea (proposed by Alex) was to maintain a text file of (unique)
identifiers.

> For the BoundaryIOManager methods that accept a Path or URL, the format would 
> still be determined by the file extension.

I'm uncomfortable with having that kind of assumption in a low-level library
(bad reminiscence of M$-DOS days).  User-code should be in charge of
associating input (e.g. file name) with how to handle it (e.g. the instantiation
of the read handler).

> If users want to use a non-standard file extension, they can open the IO 
> stream themselves and use the read/write methods that accept an IO stream and 
> format string name or Format instance.

What is "standard"/"non-standard"?  You use "txt", but the most standard
meaning of this extension is that the contents is ASCII-encoded...
And "csv" is also not sufficient to convery that contents is actually much
more constrained than a comma-separated list of strings.

Couldn't a file be used to define which read/writer the library should
instantiate, and to which extension it could be associated?

>
> interface Format {
> String getName();
> List getFileExtensions();
> }
>
> class BoundaryIOManager {
> void register(BoundaryFormat fmt, BoundaryReadHandler rh, 
> BoundaryWriteHandler wh) {
> register(fmt.getName(), fmt.getFileExtensions(), rh, wh);
> }
> void register(String formatName, List extensions, 
> BoundaryReadHandler rh, BoundaryWriteHandler wh) {...}
>
> // ...
>
> void write(BoundarySource src, OutputStream out, Format fmt) {
> write(src, in, fmt.getName());
> }
> void write(BoundarySource src, OutputStream out, String formatName) 
> {...}
>
> // similar read methods ...
> }
>
> enum StandardFormat3D implements Format {
> OBJ(...),
> TXT(...),
> CSV(...);
>
> public String getName() {...}
> public List getFileExtensions() {...}
> public BoundaryReadHandler3D readHandler() { (execute a supplier 
> function)... }
> public BoundaryWriteHandler3D writeHandler() { (execute a supplier 
> function)... }
> }
>
> > The "enum" is for natively supported formats to allow for simple API, while 
> > "hiding" the actual implementations (as in "RandomSource" from "Commons 
> > RNG").
>
> I'd prefer to not hide the format-specific classes, at least not completely.

Then the case for the "enum" is moot (IIUC).

> For example, the OBJ file format can contain a lot more information than just 
> pure geometry, such as object names (more than one geometry can be contained 
> in a single file), material information (for use in rendering), free-form 
> curve definitions, etc. This information is not used to produce 
> BoundarySource3D or Mesh instances but it can be accessed easily by extending 
> AbstractOBJParser or PolygonOBJParser. Also, additional information such as 
> comments and object names can be included in output files if the OBJWriter 
> class is used directly, as opposed to IO3D or BoundaryIOManager3D. It seems 
> like a waste to completely hide this functionality.

I agree to not waste functionality.  But how is the additional contents
handled currently?  It seems that it simply discarded, and someone
wanting to retrieve it would then discard the current functionality that
only return a "BoundarySource3D".
Sorry if I'm missing something because of my not having read the code
but this makes me think that a parser generator would have allowed
for extending the support of a given format.

>
> Another reason to keep these classes public is that they may need to be 
> accessed in order to configure them. For example, the txt, csv, and obj 
> formats use a default format pattern for writing floating point numbers as 
> text. If this needs to be modified, for example to increase or decrease the 
> number of minimum fraction digits, then the format-specific type will need to 
> be accessed. The code below shows how to set a custom decimal format for OBJ 
> files (using the current code).
>
> OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D();
> wh.setDecimalFormatPattern("0.0##");

Such customizati

Re: [geometry] IO Modules

2021-01-21 Thread Matt Juntunen
Hi Gilles,

> Really, the main point is to separate format (contents) from filename 
> (container).

This makes sense. What would you think of the approach below? This would 
separate the format name from the file extension(s) and provide an enum 
containing default format information and handlers. Usage of the enum would be 
optional since there would still be overloads that accept a simple format name 
string. For the BoundaryIOManager methods that accept a Path or URL, the format 
would still be determined by the file extension. If users want to use a 
non-standard file extension, they can open the IO stream themselves and use the 
read/write methods that accept an IO stream and format string name or Format 
instance.

interface Format {
String getName();
List getFileExtensions();
}

class BoundaryIOManager {
void register(BoundaryFormat fmt, BoundaryReadHandler rh, 
BoundaryWriteHandler wh) {
register(fmt.getName(), fmt.getFileExtensions(), rh, wh);
}
void register(String formatName, List extensions, 
BoundaryReadHandler rh, BoundaryWriteHandler wh) {...}

// ...

void write(BoundarySource src, OutputStream out, Format fmt) {
write(src, in, fmt.getName());
}
void write(BoundarySource src, OutputStream out, String formatName) 
{...}

// similar read methods ...
}

enum StandardFormat3D implements Format {
OBJ(...),
TXT(...),
CSV(...);

public String getName() {...}
public List getFileExtensions() {...}
public BoundaryReadHandler3D readHandler() { (execute a supplier 
function)... }
public BoundaryWriteHandler3D writeHandler() { (execute a supplier 
function)... }
}

> The "enum" is for natively supported formats to allow for simple API, while 
> "hiding" the actual implementations (as in "RandomSource" from "Commons RNG").

I'd prefer to not hide the format-specific classes, at least not completely. 
For example, the OBJ file format can contain a lot more information than just 
pure geometry, such as object names (more than one geometry can be contained in 
a single file), material information (for use in rendering), free-form curve 
definitions, etc. This information is not used to produce BoundarySource3D or 
Mesh instances but it can be accessed easily by extending AbstractOBJParser or 
PolygonOBJParser. Also, additional information such as comments and object 
names can be included in output files if the OBJWriter class is used directly, 
as opposed to IO3D or BoundaryIOManager3D. It seems like a waste to completely 
hide this functionality.

Another reason to keep these classes public is that they may need to be 
accessed in order to configure them. For example, the txt, csv, and obj formats 
use a default format pattern for writing floating point numbers as text. If 
this needs to be modified, for example to increase or decrease the number of 
minimum fraction digits, then the format-specific type will need to be 
accessed. The code below shows how to set a custom decimal format for OBJ files 
(using the current code).

OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D();
wh.setDecimalFormatPattern("0.0##");

IO3D.getDefaultManager().registerWriteHandler("obj", wh);

One additional question that I thought of while looking at your example code: 
what is our convention for class names that contain acronyms or other sequences 
of capitalized letters? In other words, should it be OBJWriter or ObjWriter?

Regards,
Matt J


From: Gilles Sadowski 
Sent: Thursday, January 21, 2021 7:40 AM
To: Commons Developers List 
Subject: Re: [geometry] IO Modules

Hello.

Le mer. 20 janv. 2021 à 23:55, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> I've updated the PR with the new module/package names.
>
> > I don't see the link between "(not) extensible" and "enum": Extensibility 
> > is provided by API (which classes are public and meant to be reused, e.g. 
> > by custom IO code) while the "enum" defines the file formats that are 
> > "natively" supported by this library.
>
> I might not be picturing this correctly so perhaps a code example would help. 
> Here is what I'd like to be able to do with the IO code:
>
> // add custom handler for my own file format
> IO3D.getDefaultManager().registerReadHandler("matt", new 
> MattFileReadHandler());
>
> // read a file using that format
> BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt"));
>
> I don't see how the above could be accomplished if the supported formats are 
> in an enum.

The "enum" is for natively supported formats to allow for simple API, while
"hiding&

Re: [geometry] IO Modules

2021-01-21 Thread Gilles Sadowski
Hello.

Le mer. 20 janv. 2021 à 23:55, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> I've updated the PR with the new module/package names.
>
> > I don't see the link between "(not) extensible" and "enum": Extensibility 
> > is provided by API (which classes are public and meant to be reused, e.g. 
> > by custom IO code) while the "enum" defines the file formats that are 
> > "natively" supported by this library.
>
> I might not be picturing this correctly so perhaps a code example would help. 
> Here is what I'd like to be able to do with the IO code:
>
> // add custom handler for my own file format
> IO3D.getDefaultManager().registerReadHandler("matt", new 
> MattFileReadHandler());
>
> // read a file using that format
> BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt"));
>
> I don't see how the above could be accomplished if the supported formats are 
> in an enum.

The "enum" is for natively supported formats to allow for simple API, while
"hiding" the actual implementations (as in "RandomSource" from "Commons
RNG").
Really, the main point is to separate format (contents) from filename
(container).

// Library code.

/** Native file formats. */
enum FileFormat {
// "ObjFileReadHandler" and "ObjFileWriteHandler" must be "internal".
OBJ(new ObjFileReadHandler(), new ObjFileWriteHandler());

private final FileReadHandler rh;
private final FileReadHandler wh;

FileFormat(FileReadHandler rh,
  FileWriteHandle wh) {
this.rh = rh;
this.wh = wh;
}

// Methods to support "register" and "registerSuffix" (see below).
}

// Method does not exist yet.
IO3D.getDefaultManager().register(FileFormat.OBJ);

// In user code

// Choose format, and file name (arbitrary).
IO3D.write(src, filename, FileFormat.OBJ);

// Associate user-preferred suffix(es) to a natively supported file format...
IO3D.getDefaultManager().registerSuffix(FileFormat.OBJ, ".obj",
".OBJ"); // Method does not exist yet.
// .. and the library will add the default suffix (".obj").
IO3D.write(src, filenamePrefix, FileFormat.OBJ);

// User-defined handlers.
final String id = IO3D.getDefaultManager().register(new
MattFileReadHandler(), new MattFileWriteHandler()); // Method does not
exist yet.
IO3D.getDefaultManager().registerSuffix(id, "matt");

> Also, the file extension approach above is similar to that used by 
> javax.imageio.ImageIO.

IIUC, they also make the difference pointed out above:
https://docs.oracle.com/javase/7/docs/api/javax/imageio/ImageIO.html#getImageReadersByMIMEType(java.lang.String)
https://docs.oracle.com/javase/7/docs/api/javax/imageio/ImageIO.html#getImageReadersBySuffix(java.lang.String)

Sorry if I've cut some corners...

Regards,
Gilles

>
> Regards,
> Matt
>
> 
> From: Gilles Sadowski 
> Sent: Wednesday, January 20, 2021 5:28 PM
> To: Commons Developers List 
> Subject: Re: [geometry] IO Modules
>
> Le mer. 20 janv. 2021 à 22:42, Matt Juntunen
>  a écrit :
> >
> > Hi Gilles,
> >
> > > package "io" does not belong in "core".
> >
> > Ah, yes. I'd like to keep the core and euclidean IO modules separate since 
> > I'm picturing a spherical IO module later on down the road. In that case, 
> > perhaps we could just put the "io" portion of the module/package name 
> > before everything else? That would give us
> >
> > commons-geometry-io-core - contains package o.a.c.geometry.io.core
> > commons-geometry-io-euclidean - contains package 
> > o.a.c.geometry.io.euclidean
>
> Yes; having "io" as the top-level package is what I was suggesting.
>
> >
> > > Then, we had also discussed that it would have been more robust to use a 
> > > parser generator...
> >
> > Yes. I looked into that and ended up deciding that a full parser-generator 
> > would be overkill for what I wanted to do and possibly more work to 
> > maintain/customize. I've placed the current low-level parsing code 
> > (consisting of the CharReadBuffer and SimpleTextParser classes) in the 
> > "internal" package o.a.c.geometry.core.io.internal. I don't intend for 
> > these classes to be part of the public API, which would allow us to switch 
> > to a parser-generator later if needed.
>
> OK!
>
> >
> > > I haven't read all the code (sorry!)
> >
> > No worries. There is a lot of it :-)
> >
> > > I'm not convinced by the usage of file "exten

Re: [geometry] IO Modules

2021-01-20 Thread Matt Juntunen
Hi Gilles,

I've updated the PR with the new module/package names.

> I don't see the link between "(not) extensible" and "enum": Extensibility is 
> provided by API (which classes are public and meant to be reused, e.g. by 
> custom IO code) while the "enum" defines the file formats that are "natively" 
> supported by this library.

I might not be picturing this correctly so perhaps a code example would help. 
Here is what I'd like to be able to do with the IO code:

// add custom handler for my own file format
IO3D.getDefaultManager().registerReadHandler("matt", new MattFileReadHandler());

// read a file using that format
BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt"));

I don't see how the above could be accomplished if the supported formats are in 
an enum. Also, the file extension approach above is similar to that used by 
javax.imageio.ImageIO.

Regards,
Matt


From: Gilles Sadowski 
Sent: Wednesday, January 20, 2021 5:28 PM
To: Commons Developers List 
Subject: Re: [geometry] IO Modules

Le mer. 20 janv. 2021 à 22:42, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> > package "io" does not belong in "core".
>
> Ah, yes. I'd like to keep the core and euclidean IO modules separate since 
> I'm picturing a spherical IO module later on down the road. In that case, 
> perhaps we could just put the "io" portion of the module/package name before 
> everything else? That would give us
>
> commons-geometry-io-core - contains package o.a.c.geometry.io.core
> commons-geometry-io-euclidean - contains package 
> o.a.c.geometry.io.euclidean

Yes; having "io" as the top-level package is what I was suggesting.

>
> > Then, we had also discussed that it would have been more robust to use a 
> > parser generator...
>
> Yes. I looked into that and ended up deciding that a full parser-generator 
> would be overkill for what I wanted to do and possibly more work to 
> maintain/customize. I've placed the current low-level parsing code 
> (consisting of the CharReadBuffer and SimpleTextParser classes) in the 
> "internal" package o.a.c.geometry.core.io.internal. I don't intend for these 
> classes to be part of the public API, which would allow us to switch to a 
> parser-generator later if needed.

OK!

>
> > I haven't read all the code (sorry!)
>
> No worries. There is a lot of it :-)
>
> > I'm not convinced by the usage of file "extensions" to determine how the 
> > contents should be parsed.  At first sight, an "enum"-based approach[3] 
> > would be more flexible and clearer from a code design POV.
>
> In contrast to the rest of the library, I'd like to make these IO modules as 
> extensible as possible, the reason being that they form the main connection 
> between the library and external systems. If we use an enum to indicate the 
> format, we restrict usage of the API to only those formats. A string name, on 
> the other hand, allows users to define their own formats and use them with 
> the API.

[I'm only answering based on the impression I get from your description above.]
I don't see the link between "(not) extensible" and "enum":
Extensibility is provided by
API (which classes are public and meant to be reused, e.g. by custom
IO code) while
the "enum" defines the file formats that are "natively" supported by
this library.
[And, in time, custom code can become part of the supported formats and get its
corresponding "enum" element.]
The file extensions could be supported too, but I would think at a higher level;
mapping to either the "enum", or to custom code.

> For example, we will soon be creating a custom geometry format for an 
> application at my day job and I would like to be able to use these IO modules 
> to read and write that format (using custom BoundaryReadHandler3D and 
> BoundaryWriteHandler3D implementations).

Is what I wrote above preventing that?

Best,
Gilles

>
> Thoughts?
>
> Regards,
> Matt J
>
>
>
> 
> From: Gilles Sadowski 
> Sent: Wednesday, January 20, 2021 9:17 AM
> To: Commons Developers List 
> Subject: Re: [geometry] IO Modules
>
> Hi Matt.
>
> Le mer. 20 janv. 2021 à 05:03, Matt Juntunen
>  a écrit :
> >
> > Hello,
> >
> > I've created GEOMETRY-115 and an associated PR [1] containing new modules 
> > for IO functionality. The new modules are
> >
> >   *   commons-geometry-core-io - Common space-independent interfaces and 
> > classes
> >   *   commons-geometry-euclidean-i

Re: [geometry] IO Modules

2021-01-20 Thread Gilles Sadowski
Le mer. 20 janv. 2021 à 22:42, Matt Juntunen
 a écrit :
>
> Hi Gilles,
>
> > package "io" does not belong in "core".
>
> Ah, yes. I'd like to keep the core and euclidean IO modules separate since 
> I'm picturing a spherical IO module later on down the road. In that case, 
> perhaps we could just put the "io" portion of the module/package name before 
> everything else? That would give us
>
> commons-geometry-io-core - contains package o.a.c.geometry.io.core
> commons-geometry-io-euclidean - contains package 
> o.a.c.geometry.io.euclidean

Yes; having "io" as the top-level package is what I was suggesting.

>
> > Then, we had also discussed that it would have been more robust to use a 
> > parser generator...
>
> Yes. I looked into that and ended up deciding that a full parser-generator 
> would be overkill for what I wanted to do and possibly more work to 
> maintain/customize. I've placed the current low-level parsing code 
> (consisting of the CharReadBuffer and SimpleTextParser classes) in the 
> "internal" package o.a.c.geometry.core.io.internal. I don't intend for these 
> classes to be part of the public API, which would allow us to switch to a 
> parser-generator later if needed.

OK!

>
> > I haven't read all the code (sorry!)
>
> No worries. There is a lot of it :-)
>
> > I'm not convinced by the usage of file "extensions" to determine how the 
> > contents should be parsed.  At first sight, an "enum"-based approach[3] 
> > would be more flexible and clearer from a code design POV.
>
> In contrast to the rest of the library, I'd like to make these IO modules as 
> extensible as possible, the reason being that they form the main connection 
> between the library and external systems. If we use an enum to indicate the 
> format, we restrict usage of the API to only those formats. A string name, on 
> the other hand, allows users to define their own formats and use them with 
> the API.

[I'm only answering based on the impression I get from your description above.]
I don't see the link between "(not) extensible" and "enum":
Extensibility is provided by
API (which classes are public and meant to be reused, e.g. by custom
IO code) while
the "enum" defines the file formats that are "natively" supported by
this library.
[And, in time, custom code can become part of the supported formats and get its
corresponding "enum" element.]
The file extensions could be supported too, but I would think at a higher level;
mapping to either the "enum", or to custom code.

> For example, we will soon be creating a custom geometry format for an 
> application at my day job and I would like to be able to use these IO modules 
> to read and write that format (using custom BoundaryReadHandler3D and 
> BoundaryWriteHandler3D implementations).

Is what I wrote above preventing that?

Best,
Gilles

>
> Thoughts?
>
> Regards,
> Matt J
>
>
>
> 
> From: Gilles Sadowski 
> Sent: Wednesday, January 20, 2021 9:17 AM
> To: Commons Developers List 
> Subject: Re: [geometry] IO Modules
>
> Hi Matt.
>
> Le mer. 20 janv. 2021 à 05:03, Matt Juntunen
>  a écrit :
> >
> > Hello,
> >
> > I've created GEOMETRY-115 and an associated PR [1] containing new modules 
> > for IO functionality. The new modules are
> >
> >   *   commons-geometry-core-io - Common space-independent interfaces and 
> > classes
> >   *   commons-geometry-euclidean-io - Euclidean IO classes; currently 
> > contains support for the 3D formats TXT, CSV, and OBJ
> >
> > The API is based on a core BoundaryIOManager class that delegates to 
> > BoundaryReadHandler and BoundaryWriteHandler implementations based on the 
> > requested data format. For Euclidean 3D space, a convenience IO3D class is 
> > provided with static methods that delegate to a default manager instance. 
> > In addition to reading and writing the core geometric types for the library 
> > (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading 
> > and writing a FacetDefinition interface, which exposes simple, unvalidated 
> > geometric data. This is intended for accessing raw (possibly invalid) 
> > geometric data from files and writing data contained in external data 
> > structures (for example, a custom facet class). The example below is from 
> > the IO3D class documentation and demonstrates a read, transform, write 
> > operation using streams.
> >
> > final Path origFile = tempDir.resolve("orig.obj&quo

Re: [geometry] IO Modules

2021-01-20 Thread Matt Juntunen
Hi Gilles,

> package "io" does not belong in "core".

Ah, yes. I'd like to keep the core and euclidean IO modules separate since I'm 
picturing a spherical IO module later on down the road. In that case, perhaps 
we could just put the "io" portion of the module/package name before everything 
else? That would give us

commons-geometry-io-core - contains package o.a.c.geometry.io.core
commons-geometry-io-euclidean - contains package o.a.c.geometry.io.euclidean

> Then, we had also discussed that it would have been more robust to use a 
> parser generator...

Yes. I looked into that and ended up deciding that a full parser-generator 
would be overkill for what I wanted to do and possibly more work to 
maintain/customize. I've placed the current low-level parsing code (consisting 
of the CharReadBuffer and SimpleTextParser classes) in the "internal" package 
o.a.c.geometry.core.io.internal. I don't intend for these classes to be part of 
the public API, which would allow us to switch to a parser-generator later if 
needed.

> I haven't read all the code (sorry!)

No worries. There is a lot of it :-)

> I'm not convinced by the usage of file "extensions" to determine how the 
> contents should be parsed.  At first sight, an "enum"-based approach[3] would 
> be more flexible and clearer from a code design POV.

In contrast to the rest of the library, I'd like to make these IO modules as 
extensible as possible, the reason being that they form the main connection 
between the library and external systems. If we use an enum to indicate the 
format, we restrict usage of the API to only those formats. A string name, on 
the other hand, allows users to define their own formats and use them with the 
API. For example, we will soon be creating a custom geometry format for an 
application at my day job and I would like to be able to use these IO modules 
to read and write that format (using custom BoundaryReadHandler3D and 
BoundaryWriteHandler3D implementations).

Thoughts?

Regards,
Matt J



________
From: Gilles Sadowski 
Sent: Wednesday, January 20, 2021 9:17 AM
To: Commons Developers List 
Subject: Re: [geometry] IO Modules

Hi Matt.

Le mer. 20 janv. 2021 à 05:03, Matt Juntunen
 a écrit :
>
> Hello,
>
> I've created GEOMETRY-115 and an associated PR [1] containing new modules for 
> IO functionality. The new modules are
>
>   *   commons-geometry-core-io - Common space-independent interfaces and 
> classes
>   *   commons-geometry-euclidean-io - Euclidean IO classes; currently 
> contains support for the 3D formats TXT, CSV, and OBJ
>
> The API is based on a core BoundaryIOManager class that delegates to 
> BoundaryReadHandler and BoundaryWriteHandler implementations based on the 
> requested data format. For Euclidean 3D space, a convenience IO3D class is 
> provided with static methods that delegate to a default manager instance. In 
> addition to reading and writing the core geometric types for the library 
> (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading and 
> writing a FacetDefinition interface, which exposes simple, unvalidated 
> geometric data. This is intended for accessing raw (possibly invalid) 
> geometric data from files and writing data contained in external data 
> structures (for example, a custom facet class). The example below is from the 
> IO3D class documentation and demonstrates a read, transform, write operation 
> using streams.
>
> final Path origFile = tempDir.resolve("orig.obj");
> final Path scaledFile = tempDir.resolve("scaled.csv");
>
> final DoublePrecisionContext precision = new 
> EpsilonDoublePrecisionContext(1e-10);
> final BoundarySource3D src = Parallelepiped.unitCube(precision);
>
> IO3D.write(src, origFile);
>
> final AffineTransformMatrix3D transform = 
> AffineTransformMatrix3D.createScale(2);
>
> try (Stream stream = IO3D.triangles(origFile, precision)) 
> {
> IO3D.write(stream.map(t -> t.transform(transform)), scaledFile);
> }
>
> Feedback is welcome.

The high-level API, illustrated in the example above, looks quite fine.
But we should enforce "separation of concerns"; in this particular case:
package "io" does not belong in "core".

The Java package
  org.apache.commons.geometry.core
is defined the within the (maven) module
  commons-geometry-core
The new Java package
  org.apache.commons.geometry.core.io
is defined the within the (maven) module
  commons-geometry-core-io

The (maven) module name is also often used to define the
Java 9+ module name for JPMS.[1]
IIUC, the package
  org.apache.commons.geometry.core
and its "sub-pa

Re: [geometry] IO Modules

2021-01-20 Thread Gilles Sadowski
Hi Matt.

Le mer. 20 janv. 2021 à 05:03, Matt Juntunen
 a écrit :
>
> Hello,
>
> I've created GEOMETRY-115 and an associated PR [1] containing new modules for 
> IO functionality. The new modules are
>
>   *   commons-geometry-core-io - Common space-independent interfaces and 
> classes
>   *   commons-geometry-euclidean-io - Euclidean IO classes; currently 
> contains support for the 3D formats TXT, CSV, and OBJ
>
> The API is based on a core BoundaryIOManager class that delegates to 
> BoundaryReadHandler and BoundaryWriteHandler implementations based on the 
> requested data format. For Euclidean 3D space, a convenience IO3D class is 
> provided with static methods that delegate to a default manager instance. In 
> addition to reading and writing the core geometric types for the library 
> (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading and 
> writing a FacetDefinition interface, which exposes simple, unvalidated 
> geometric data. This is intended for accessing raw (possibly invalid) 
> geometric data from files and writing data contained in external data 
> structures (for example, a custom facet class). The example below is from the 
> IO3D class documentation and demonstrates a read, transform, write operation 
> using streams.
>
> final Path origFile = tempDir.resolve("orig.obj");
> final Path scaledFile = tempDir.resolve("scaled.csv");
>
> final DoublePrecisionContext precision = new 
> EpsilonDoublePrecisionContext(1e-10);
> final BoundarySource3D src = Parallelepiped.unitCube(precision);
>
> IO3D.write(src, origFile);
>
> final AffineTransformMatrix3D transform = 
> AffineTransformMatrix3D.createScale(2);
>
> try (Stream stream = IO3D.triangles(origFile, precision)) 
> {
> IO3D.write(stream.map(t -> t.transform(transform)), scaledFile);
> }
>
> Feedback is welcome.

The high-level API, illustrated in the example above, looks quite fine.
But we should enforce "separation of concerns"; in this particular case:
package "io" does not belong in "core".

The Java package
  org.apache.commons.geometry.core
is defined the within the (maven) module
  commons-geometry-core
The new Java package
  org.apache.commons.geometry.core.io
is defined the within the (maven) module
  commons-geometry-core-io

The (maven) module name is also often used to define the
Java 9+ module name for JPMS.[1]
IIUC, the package
  org.apache.commons.geometry.core
and its "sub-package"
  org.apache.commons.geometry.core.io
cannot belong to different (JPMS) modules, because (IIUC) a package
hierarchy cannot be shared across modules (since one of the purposes
is to organize "visibility").
Components should be JPMS-compatible.[2]

Fixing that would just require to
1. remove the two modules introduced in PR #130
  commons-geometry-core-io
  commons-geometry-euclidean-io
2. create one (maven) module
  commons-geometry-io
owning the Java packages
  org.apache.commons.geometry.io
  org.apache.commons.geometry.io.core
  org.apache.commons.geometry.io.euclidean

[If you think that it's really worth having separate modules for loading
"euclidean" data vs other space-types (not yet implemented IIUC),
point (2) would be adapted accordingly.]

This would fix both the JPMS issue, and the design issue.

Then, we had also discussed that it would have been more robust to
use a parser generator...
In either case, I'd suggest that we put the actual I/O classes in an
"internal" package, and only advertize the above high-level API (i.e. we
should not be expected to maintain compatibility of the parser code).

I haven't read all the code (sorry!), but in class "IO3D", I'm not convinced
by the usage of file "extensions" to determine how the contents should be
parsed.  At first sight, an "enum"-based approach[3] would be more flexible
and clearer from a code design POV.

Regards,
Gilles

[1] https://www.oracle.com/corporate/features/understanding-java-9-modules.html
[2] See 
https://gitbox.apache.org/repos/asf?p=commons-rng.git;a=tree;f=commons-rng-examples/examples-jpms;hb=HEAD
[3] See 
https://gitbox.apache.org/repos/asf?p=commons-rng.git;a=blob;f=commons-rng-simple/src/main/java/org/apache/commons/rng/simple/RandomSource.java;h=65ca02fdc285fbb7ea8305008dbce21f571191d7;hb=HEAD

>
> Regards,
> Matt J
>
> [1] https://github.com/apache/commons-geometry/pull/130

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org