Re: [geometry] IO Modules - Part 2
I have not received any responses on this so I am going to go ahead and merge the PR. -Matt From: Matt Juntunen Sent: Sunday, April 4, 2021 7:32 AM To: Commons Developers List Subject: Re: [geometry] IO Modules - Part 2 Hello, Would anyone like more time to look over this? If not, and assuming there are no objections, I plan on merging it in the next few days. Regards, Matt J From: Matt Juntunen Sent: Sunday, March 28, 2021 1:39 PM To: Commons Developers List Subject: [geometry] IO Modules - Part 2 Hello, I've created a new PR with IO modules for commons-geometry [1]. This is a refactoring of an earlier PR I had and incorporates changes suggested by reviewers. Here are some of the main differences with the previous version: 1. Inputs and outputs are abstracted behind GeometryInput and GeometryOutput interfaces. The interface provides access to the input/output stream, the file name (if available), and a charset (if applicable and available). 2. Formats are represented with the GeometryFormat interface. This interface defines the format name and standard file extensions associated with the format. 3. There is now a GeometryFormat3D enum that implements GeometryFormat and provides easy access to the formats supported by the library. 4. The IO3D class provides static convenience methods for simple operations. If a path or url is passed with no specified format, the file extension is examined to determine the format. 5. I've added support for STL files (GEOMETRY-101). Below is an example using the BoundaryManager3D class, which is the class users would use if they wanted to customize the default IO functionality. DoublePrecisionContext precision = new EpsilonDoublePrecisionContext(1e-10); // create some geometry to write BoundarySource3D cube = Parallelepiped.unitCube(precision); // create the manager and add handlers BoundaryIOManager3D manager = new BoundaryIOManager3D(); manager.registerReadHandler(new StlBoundaryReadHandler3D()); manager.registerWriteHandler(new StlBoundaryWriteHandler3D()); Path file = Paths.get("target/cube.stl"); // write manager.write(cube, new FileGeometryOutput(file), GeometryFormat3D.STL); // read back BoundarySource3D result = manager.read(new FileGeometryInput(file), GeometryFormat3D.STL, precision); Here is the same example using the convenience methods in IO3D. The file format is inferred from the file name. DoublePrecisionContext precision = new EpsilonDoublePrecisionContext(1e-10); // create some geometry to write BoundarySource3D cube = Parallelepiped.unitCube(precision); Path file = Paths.get("target/cube.stl"); // write IO3D.write(cube, file); // read back BoundarySource3D result = IO3D.read(file, precision); If anyone has time to review the PR that would be most welcome. Regards, Matt J [1] https://github.com/apache/commons-geometry/pull/141
Re: [geometry] IO Modules - Part 2
Hello, Would anyone like more time to look over this? If not, and assuming there are no objections, I plan on merging it in the next few days. Regards, Matt J From: Matt Juntunen Sent: Sunday, March 28, 2021 1:39 PM To: Commons Developers List Subject: [geometry] IO Modules - Part 2 Hello, I've created a new PR with IO modules for commons-geometry [1]. This is a refactoring of an earlier PR I had and incorporates changes suggested by reviewers. Here are some of the main differences with the previous version: 1. Inputs and outputs are abstracted behind GeometryInput and GeometryOutput interfaces. The interface provides access to the input/output stream, the file name (if available), and a charset (if applicable and available). 2. Formats are represented with the GeometryFormat interface. This interface defines the format name and standard file extensions associated with the format. 3. There is now a GeometryFormat3D enum that implements GeometryFormat and provides easy access to the formats supported by the library. 4. The IO3D class provides static convenience methods for simple operations. If a path or url is passed with no specified format, the file extension is examined to determine the format. 5. I've added support for STL files (GEOMETRY-101). Below is an example using the BoundaryManager3D class, which is the class users would use if they wanted to customize the default IO functionality. DoublePrecisionContext precision = new EpsilonDoublePrecisionContext(1e-10); // create some geometry to write BoundarySource3D cube = Parallelepiped.unitCube(precision); // create the manager and add handlers BoundaryIOManager3D manager = new BoundaryIOManager3D(); manager.registerReadHandler(new StlBoundaryReadHandler3D()); manager.registerWriteHandler(new StlBoundaryWriteHandler3D()); Path file = Paths.get("target/cube.stl"); // write manager.write(cube, new FileGeometryOutput(file), GeometryFormat3D.STL); // read back BoundarySource3D result = manager.read(new FileGeometryInput(file), GeometryFormat3D.STL, precision); Here is the same example using the convenience methods in IO3D. The file format is inferred from the file name. DoublePrecisionContext precision = new EpsilonDoublePrecisionContext(1e-10); // create some geometry to write BoundarySource3D cube = Parallelepiped.unitCube(precision); Path file = Paths.get("target/cube.stl"); // write IO3D.write(cube, file); // read back BoundarySource3D result = IO3D.read(file, precision); If anyone has time to review the PR that would be most welcome. Regards, Matt J [1] https://github.com/apache/commons-geometry/pull/141
Re: [geometry] IO Modules
Hello. Le lun. 25 janv. 2021 à 13:40, Matt Juntunen a écrit : > > Hello, > > I have two main goals for the IO modules here: > > 1. Provide a simple, high-level API (i.e. IO3D) for reading and writing > geometry with a minimum of fuss. Sure. But this high-level API looks like "syntactic sugar" that can certainly be done in a number of ways (with more or less code and/or more or less flexibility). > 2. Provide a low-level, extensible API specific to each data format that > can be used to access addition format-specific information while reading and > provide greater control over the output while writing. > > So, there are actually two different APIs in question here. Users could use > the high-level API when only the geometry itself is of interest and the > low-level API when additional metadata is required. Useful examples of this > metadata are the object and group names from the OBJ format (which can be > used to store separate geometries in a single file) and the facet attribute > bytes in binary STL files (which are sometimes used to store color > information or other values). This information does not map directly to any > data structures in commons-geometry but it is certainly useful to be able to > access it (I will want to do so in my day job, for instance). In effect, how would one map (application) data that is tied to a (library) facet instance? > > > Such customization could also be handled at the application level through > a (handler-specific) property file. > > I'd rather not deal with configuration files and keep things simple and > lightweight. Maybe I was not clear because I don't see how this configuration file(s) make it less lightweight from a casual user's POV. A default configuration would be provided (that associates default extensions with the library-provided handlers). User could easily append extensions and handlers, rather than having to do it in code. It also seems like a feature that one could disable a handler (rather that having code always loaded for a format which the user doesn't actually want to use). > > Then the case for the "enum" is moot (IIUC). > > Yes, it might be. I would like to allow format names to be mapped to more > than one file extension, though. I don't understand how "enum" and multiple extensions are related... My remark was about the "enum" usage in order to enforce a single API for file format (but I understood that your use-case requires more flexibility). > > > User-code should be in charge of associating input (e.g. file name) with > > how to handle it (e.g. the instantiation of the read handler). > > This would be the case for the low-level API, but I want the high-level API > to be able to handle this itself, based on its configuration. I want to be > able to call 'IO3D.read(Paths.get("cube.obj"))' This would just work (the devil being in the details) with the hypothetical "handlers.conf" file containing: ---CUT--- formats=OBJ OBJ.extensions=obj,OBJ OBJ.reader=org.apache.commons.geometry.io.euclidean.threed.ObjFormatReader ---CUT-- Regards, Gilles > just as I might call 'ImageIO.read(new File("image.png"))'. > > Regards, > Matt J > > > From: Gilles Sadowski > Sent: Saturday, January 23, 2021 9:40 AM > To: Commons Developers List > Subject: Re: [geometry] IO Modules > > Hi. > > Le ven. 22 janv. 2021 à 03:38, Matt Juntunen > a écrit : > > > > Hi Gilles, > > > > > Really, the main point is to separate format (contents) from filename > > > (container). > > > > This makes sense. What would you think of the approach below? > > I have no strong objections, as I do not graps all the requirements. > [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs > "java.nio").] > > > This would separate the format name from the file extension(s) and provide > > an enum containing default format information and handlers. Usage of the > > enum would be optional since there would still be overloads that accept a > > simple format name string. > > It reminds me of a discussion concerning "Bloom filters", about identifiers > for a hash function that could user-defined. > IIRC, one idea (proposed by Alex) was to maintain a text file of (unique) > identifiers. > > > For the BoundaryIOManager methods that accept a Path or URL, the format > > would still be determined by the file extension. > > I'm uncomfortable with having that kind of assumption in a low-level library > (bad reminiscence of M$-DOS days). User-code should be in charge of
Re: [geometry] IO Modules
Hello, I have two main goals for the IO modules here: 1. Provide a simple, high-level API (i.e. IO3D) for reading and writing geometry with a minimum of fuss. 2. Provide a low-level, extensible API specific to each data format that can be used to access addition format-specific information while reading and provide greater control over the output while writing. So, there are actually two different APIs in question here. Users could use the high-level API when only the geometry itself is of interest and the low-level API when additional metadata is required. Useful examples of this metadata are the object and group names from the OBJ format (which can be used to store separate geometries in a single file) and the facet attribute bytes in binary STL files (which are sometimes used to store color information or other values). This information does not map directly to any data structures in commons-geometry but it is certainly useful to be able to access it (I will want to do so in my day job, for instance). > Such customization could also be handled at the application level through a (handler-specific) property file. I'd rather not deal with configuration files and keep things simple and lightweight. > Then the case for the "enum" is moot (IIUC). Yes, it might be. I would like to allow format names to be mapped to more than one file extension, though. > User-code should be in charge of associating input (e.g. file name) with how > to handle it (e.g. the instantiation of the read handler). This would be the case for the low-level API, but I want the high-level API to be able to handle this itself, based on its configuration. I want to be able to call 'IO3D.read(Paths.get("cube.obj"))' just as I might call 'ImageIO.read(new File("image.png"))'. Regards, Matt J From: Gilles Sadowski Sent: Saturday, January 23, 2021 9:40 AM To: Commons Developers List Subject: Re: [geometry] IO Modules Hi. Le ven. 22 janv. 2021 à 03:38, Matt Juntunen a écrit : > > Hi Gilles, > > > Really, the main point is to separate format (contents) from filename > > (container). > > This makes sense. What would you think of the approach below? I have no strong objections, as I do not graps all the requirements. [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs "java.nio").] > This would separate the format name from the file extension(s) and provide an > enum containing default format information and handlers. Usage of the enum > would be optional since there would still be overloads that accept a simple > format name string. It reminds me of a discussion concerning "Bloom filters", about identifiers for a hash function that could user-defined. IIRC, one idea (proposed by Alex) was to maintain a text file of (unique) identifiers. > For the BoundaryIOManager methods that accept a Path or URL, the format would > still be determined by the file extension. I'm uncomfortable with having that kind of assumption in a low-level library (bad reminiscence of M$-DOS days). User-code should be in charge of associating input (e.g. file name) with how to handle it (e.g. the instantiation of the read handler). > If users want to use a non-standard file extension, they can open the IO > stream themselves and use the read/write methods that accept an IO stream and > format string name or Format instance. What is "standard"/"non-standard"? You use "txt", but the most standard meaning of this extension is that the contents is ASCII-encoded... And "csv" is also not sufficient to convery that contents is actually much more constrained than a comma-separated list of strings. Couldn't a file be used to define which read/writer the library should instantiate, and to which extension it could be associated? > > interface Format { > String getName(); > List getFileExtensions(); > } > > class BoundaryIOManager { > void register(BoundaryFormat fmt, BoundaryReadHandler rh, > BoundaryWriteHandler wh) { > register(fmt.getName(), fmt.getFileExtensions(), rh, wh); > } > void register(String formatName, List extensions, > BoundaryReadHandler rh, BoundaryWriteHandler wh) {...} > > // ... > > void write(BoundarySource src, OutputStream out, Format fmt) { > write(src, in, fmt.getName()); > } > void write(BoundarySource src, OutputStream out, String formatName) > {...} > > // similar read methods ... > } > > enum StandardFormat3D implements Format { > OBJ(...), > TXT(...), > CSV(...); > > public String getName() {...} >
Re: [geometry] IO Modules
Hi. Le ven. 22 janv. 2021 à 03:38, Matt Juntunen a écrit : > > Hi Gilles, > > > Really, the main point is to separate format (contents) from filename > > (container). > > This makes sense. What would you think of the approach below? I have no strong objections, as I do not graps all the requirements. [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs "java.nio").] > This would separate the format name from the file extension(s) and provide an > enum containing default format information and handlers. Usage of the enum > would be optional since there would still be overloads that accept a simple > format name string. It reminds me of a discussion concerning "Bloom filters", about identifiers for a hash function that could user-defined. IIRC, one idea (proposed by Alex) was to maintain a text file of (unique) identifiers. > For the BoundaryIOManager methods that accept a Path or URL, the format would > still be determined by the file extension. I'm uncomfortable with having that kind of assumption in a low-level library (bad reminiscence of M$-DOS days). User-code should be in charge of associating input (e.g. file name) with how to handle it (e.g. the instantiation of the read handler). > If users want to use a non-standard file extension, they can open the IO > stream themselves and use the read/write methods that accept an IO stream and > format string name or Format instance. What is "standard"/"non-standard"? You use "txt", but the most standard meaning of this extension is that the contents is ASCII-encoded... And "csv" is also not sufficient to convery that contents is actually much more constrained than a comma-separated list of strings. Couldn't a file be used to define which read/writer the library should instantiate, and to which extension it could be associated? > > interface Format { > String getName(); > List getFileExtensions(); > } > > class BoundaryIOManager { > void register(BoundaryFormat fmt, BoundaryReadHandler rh, > BoundaryWriteHandler wh) { > register(fmt.getName(), fmt.getFileExtensions(), rh, wh); > } > void register(String formatName, List extensions, > BoundaryReadHandler rh, BoundaryWriteHandler wh) {...} > > // ... > > void write(BoundarySource src, OutputStream out, Format fmt) { > write(src, in, fmt.getName()); > } > void write(BoundarySource src, OutputStream out, String formatName) > {...} > > // similar read methods ... > } > > enum StandardFormat3D implements Format { > OBJ(...), > TXT(...), > CSV(...); > > public String getName() {...} > public List getFileExtensions() {...} > public BoundaryReadHandler3D readHandler() { (execute a supplier > function)... } > public BoundaryWriteHandler3D writeHandler() { (execute a supplier > function)... } > } > > > The "enum" is for natively supported formats to allow for simple API, while > > "hiding" the actual implementations (as in "RandomSource" from "Commons > > RNG"). > > I'd prefer to not hide the format-specific classes, at least not completely. Then the case for the "enum" is moot (IIUC). > For example, the OBJ file format can contain a lot more information than just > pure geometry, such as object names (more than one geometry can be contained > in a single file), material information (for use in rendering), free-form > curve definitions, etc. This information is not used to produce > BoundarySource3D or Mesh instances but it can be accessed easily by extending > AbstractOBJParser or PolygonOBJParser. Also, additional information such as > comments and object names can be included in output files if the OBJWriter > class is used directly, as opposed to IO3D or BoundaryIOManager3D. It seems > like a waste to completely hide this functionality. I agree to not waste functionality. But how is the additional contents handled currently? It seems that it simply discarded, and someone wanting to retrieve it would then discard the current functionality that only return a "BoundarySource3D". Sorry if I'm missing something because of my not having read the code but this makes me think that a parser generator would have allowed for extending the support of a given format. > > Another reason to keep these classes public is that they may need to be > accessed in order to configure them. For example, the txt, csv, and obj > formats use a default format pattern for writing floating point numbers as > text. If this needs to be modified, for example to increase or decrease the > number of minimum fraction digits, then the format-specific type will need to > be accessed. The code below shows how to set a custom decimal format for OBJ > files (using the current code). > > OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D(); > wh.setDecimalFormatPattern("0.0##"); Such customizati
Re: [geometry] IO Modules
Hi Gilles, > Really, the main point is to separate format (contents) from filename > (container). This makes sense. What would you think of the approach below? This would separate the format name from the file extension(s) and provide an enum containing default format information and handlers. Usage of the enum would be optional since there would still be overloads that accept a simple format name string. For the BoundaryIOManager methods that accept a Path or URL, the format would still be determined by the file extension. If users want to use a non-standard file extension, they can open the IO stream themselves and use the read/write methods that accept an IO stream and format string name or Format instance. interface Format { String getName(); List getFileExtensions(); } class BoundaryIOManager { void register(BoundaryFormat fmt, BoundaryReadHandler rh, BoundaryWriteHandler wh) { register(fmt.getName(), fmt.getFileExtensions(), rh, wh); } void register(String formatName, List extensions, BoundaryReadHandler rh, BoundaryWriteHandler wh) {...} // ... void write(BoundarySource src, OutputStream out, Format fmt) { write(src, in, fmt.getName()); } void write(BoundarySource src, OutputStream out, String formatName) {...} // similar read methods ... } enum StandardFormat3D implements Format { OBJ(...), TXT(...), CSV(...); public String getName() {...} public List getFileExtensions() {...} public BoundaryReadHandler3D readHandler() { (execute a supplier function)... } public BoundaryWriteHandler3D writeHandler() { (execute a supplier function)... } } > The "enum" is for natively supported formats to allow for simple API, while > "hiding" the actual implementations (as in "RandomSource" from "Commons RNG"). I'd prefer to not hide the format-specific classes, at least not completely. For example, the OBJ file format can contain a lot more information than just pure geometry, such as object names (more than one geometry can be contained in a single file), material information (for use in rendering), free-form curve definitions, etc. This information is not used to produce BoundarySource3D or Mesh instances but it can be accessed easily by extending AbstractOBJParser or PolygonOBJParser. Also, additional information such as comments and object names can be included in output files if the OBJWriter class is used directly, as opposed to IO3D or BoundaryIOManager3D. It seems like a waste to completely hide this functionality. Another reason to keep these classes public is that they may need to be accessed in order to configure them. For example, the txt, csv, and obj formats use a default format pattern for writing floating point numbers as text. If this needs to be modified, for example to increase or decrease the number of minimum fraction digits, then the format-specific type will need to be accessed. The code below shows how to set a custom decimal format for OBJ files (using the current code). OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D(); wh.setDecimalFormatPattern("0.0##"); IO3D.getDefaultManager().registerWriteHandler("obj", wh); One additional question that I thought of while looking at your example code: what is our convention for class names that contain acronyms or other sequences of capitalized letters? In other words, should it be OBJWriter or ObjWriter? Regards, Matt J From: Gilles Sadowski Sent: Thursday, January 21, 2021 7:40 AM To: Commons Developers List Subject: Re: [geometry] IO Modules Hello. Le mer. 20 janv. 2021 à 23:55, Matt Juntunen a écrit : > > Hi Gilles, > > I've updated the PR with the new module/package names. > > > I don't see the link between "(not) extensible" and "enum": Extensibility > > is provided by API (which classes are public and meant to be reused, e.g. > > by custom IO code) while the "enum" defines the file formats that are > > "natively" supported by this library. > > I might not be picturing this correctly so perhaps a code example would help. > Here is what I'd like to be able to do with the IO code: > > // add custom handler for my own file format > IO3D.getDefaultManager().registerReadHandler("matt", new > MattFileReadHandler()); > > // read a file using that format > BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt")); > > I don't see how the above could be accomplished if the supported formats are > in an enum. The "enum" is for natively supported formats to allow for simple API, while "hiding&
Re: [geometry] IO Modules
Hello. Le mer. 20 janv. 2021 à 23:55, Matt Juntunen a écrit : > > Hi Gilles, > > I've updated the PR with the new module/package names. > > > I don't see the link between "(not) extensible" and "enum": Extensibility > > is provided by API (which classes are public and meant to be reused, e.g. > > by custom IO code) while the "enum" defines the file formats that are > > "natively" supported by this library. > > I might not be picturing this correctly so perhaps a code example would help. > Here is what I'd like to be able to do with the IO code: > > // add custom handler for my own file format > IO3D.getDefaultManager().registerReadHandler("matt", new > MattFileReadHandler()); > > // read a file using that format > BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt")); > > I don't see how the above could be accomplished if the supported formats are > in an enum. The "enum" is for natively supported formats to allow for simple API, while "hiding" the actual implementations (as in "RandomSource" from "Commons RNG"). Really, the main point is to separate format (contents) from filename (container). // Library code. /** Native file formats. */ enum FileFormat { // "ObjFileReadHandler" and "ObjFileWriteHandler" must be "internal". OBJ(new ObjFileReadHandler(), new ObjFileWriteHandler()); private final FileReadHandler rh; private final FileReadHandler wh; FileFormat(FileReadHandler rh, FileWriteHandle wh) { this.rh = rh; this.wh = wh; } // Methods to support "register" and "registerSuffix" (see below). } // Method does not exist yet. IO3D.getDefaultManager().register(FileFormat.OBJ); // In user code // Choose format, and file name (arbitrary). IO3D.write(src, filename, FileFormat.OBJ); // Associate user-preferred suffix(es) to a natively supported file format... IO3D.getDefaultManager().registerSuffix(FileFormat.OBJ, ".obj", ".OBJ"); // Method does not exist yet. // .. and the library will add the default suffix (".obj"). IO3D.write(src, filenamePrefix, FileFormat.OBJ); // User-defined handlers. final String id = IO3D.getDefaultManager().register(new MattFileReadHandler(), new MattFileWriteHandler()); // Method does not exist yet. IO3D.getDefaultManager().registerSuffix(id, "matt"); > Also, the file extension approach above is similar to that used by > javax.imageio.ImageIO. IIUC, they also make the difference pointed out above: https://docs.oracle.com/javase/7/docs/api/javax/imageio/ImageIO.html#getImageReadersByMIMEType(java.lang.String) https://docs.oracle.com/javase/7/docs/api/javax/imageio/ImageIO.html#getImageReadersBySuffix(java.lang.String) Sorry if I've cut some corners... Regards, Gilles > > Regards, > Matt > > > From: Gilles Sadowski > Sent: Wednesday, January 20, 2021 5:28 PM > To: Commons Developers List > Subject: Re: [geometry] IO Modules > > Le mer. 20 janv. 2021 à 22:42, Matt Juntunen > a écrit : > > > > Hi Gilles, > > > > > package "io" does not belong in "core". > > > > Ah, yes. I'd like to keep the core and euclidean IO modules separate since > > I'm picturing a spherical IO module later on down the road. In that case, > > perhaps we could just put the "io" portion of the module/package name > > before everything else? That would give us > > > > commons-geometry-io-core - contains package o.a.c.geometry.io.core > > commons-geometry-io-euclidean - contains package > > o.a.c.geometry.io.euclidean > > Yes; having "io" as the top-level package is what I was suggesting. > > > > > > Then, we had also discussed that it would have been more robust to use a > > > parser generator... > > > > Yes. I looked into that and ended up deciding that a full parser-generator > > would be overkill for what I wanted to do and possibly more work to > > maintain/customize. I've placed the current low-level parsing code > > (consisting of the CharReadBuffer and SimpleTextParser classes) in the > > "internal" package o.a.c.geometry.core.io.internal. I don't intend for > > these classes to be part of the public API, which would allow us to switch > > to a parser-generator later if needed. > > OK! > > > > > > I haven't read all the code (sorry!) > > > > No worries. There is a lot of it :-) > > > > > I'm not convinced by the usage of file "exten
Re: [geometry] IO Modules
Hi Gilles, I've updated the PR with the new module/package names. > I don't see the link between "(not) extensible" and "enum": Extensibility is > provided by API (which classes are public and meant to be reused, e.g. by > custom IO code) while the "enum" defines the file formats that are "natively" > supported by this library. I might not be picturing this correctly so perhaps a code example would help. Here is what I'd like to be able to do with the IO code: // add custom handler for my own file format IO3D.getDefaultManager().registerReadHandler("matt", new MattFileReadHandler()); // read a file using that format BoundarySource3D result = IO3D.read(Paths.get("mygeometry.matt")); I don't see how the above could be accomplished if the supported formats are in an enum. Also, the file extension approach above is similar to that used by javax.imageio.ImageIO. Regards, Matt From: Gilles Sadowski Sent: Wednesday, January 20, 2021 5:28 PM To: Commons Developers List Subject: Re: [geometry] IO Modules Le mer. 20 janv. 2021 à 22:42, Matt Juntunen a écrit : > > Hi Gilles, > > > package "io" does not belong in "core". > > Ah, yes. I'd like to keep the core and euclidean IO modules separate since > I'm picturing a spherical IO module later on down the road. In that case, > perhaps we could just put the "io" portion of the module/package name before > everything else? That would give us > > commons-geometry-io-core - contains package o.a.c.geometry.io.core > commons-geometry-io-euclidean - contains package > o.a.c.geometry.io.euclidean Yes; having "io" as the top-level package is what I was suggesting. > > > Then, we had also discussed that it would have been more robust to use a > > parser generator... > > Yes. I looked into that and ended up deciding that a full parser-generator > would be overkill for what I wanted to do and possibly more work to > maintain/customize. I've placed the current low-level parsing code > (consisting of the CharReadBuffer and SimpleTextParser classes) in the > "internal" package o.a.c.geometry.core.io.internal. I don't intend for these > classes to be part of the public API, which would allow us to switch to a > parser-generator later if needed. OK! > > > I haven't read all the code (sorry!) > > No worries. There is a lot of it :-) > > > I'm not convinced by the usage of file "extensions" to determine how the > > contents should be parsed. At first sight, an "enum"-based approach[3] > > would be more flexible and clearer from a code design POV. > > In contrast to the rest of the library, I'd like to make these IO modules as > extensible as possible, the reason being that they form the main connection > between the library and external systems. If we use an enum to indicate the > format, we restrict usage of the API to only those formats. A string name, on > the other hand, allows users to define their own formats and use them with > the API. [I'm only answering based on the impression I get from your description above.] I don't see the link between "(not) extensible" and "enum": Extensibility is provided by API (which classes are public and meant to be reused, e.g. by custom IO code) while the "enum" defines the file formats that are "natively" supported by this library. [And, in time, custom code can become part of the supported formats and get its corresponding "enum" element.] The file extensions could be supported too, but I would think at a higher level; mapping to either the "enum", or to custom code. > For example, we will soon be creating a custom geometry format for an > application at my day job and I would like to be able to use these IO modules > to read and write that format (using custom BoundaryReadHandler3D and > BoundaryWriteHandler3D implementations). Is what I wrote above preventing that? Best, Gilles > > Thoughts? > > Regards, > Matt J > > > > > From: Gilles Sadowski > Sent: Wednesday, January 20, 2021 9:17 AM > To: Commons Developers List > Subject: Re: [geometry] IO Modules > > Hi Matt. > > Le mer. 20 janv. 2021 à 05:03, Matt Juntunen > a écrit : > > > > Hello, > > > > I've created GEOMETRY-115 and an associated PR [1] containing new modules > > for IO functionality. The new modules are > > > > * commons-geometry-core-io - Common space-independent interfaces and > > classes > > * commons-geometry-euclidean-i
Re: [geometry] IO Modules
Le mer. 20 janv. 2021 à 22:42, Matt Juntunen a écrit : > > Hi Gilles, > > > package "io" does not belong in "core". > > Ah, yes. I'd like to keep the core and euclidean IO modules separate since > I'm picturing a spherical IO module later on down the road. In that case, > perhaps we could just put the "io" portion of the module/package name before > everything else? That would give us > > commons-geometry-io-core - contains package o.a.c.geometry.io.core > commons-geometry-io-euclidean - contains package > o.a.c.geometry.io.euclidean Yes; having "io" as the top-level package is what I was suggesting. > > > Then, we had also discussed that it would have been more robust to use a > > parser generator... > > Yes. I looked into that and ended up deciding that a full parser-generator > would be overkill for what I wanted to do and possibly more work to > maintain/customize. I've placed the current low-level parsing code > (consisting of the CharReadBuffer and SimpleTextParser classes) in the > "internal" package o.a.c.geometry.core.io.internal. I don't intend for these > classes to be part of the public API, which would allow us to switch to a > parser-generator later if needed. OK! > > > I haven't read all the code (sorry!) > > No worries. There is a lot of it :-) > > > I'm not convinced by the usage of file "extensions" to determine how the > > contents should be parsed. At first sight, an "enum"-based approach[3] > > would be more flexible and clearer from a code design POV. > > In contrast to the rest of the library, I'd like to make these IO modules as > extensible as possible, the reason being that they form the main connection > between the library and external systems. If we use an enum to indicate the > format, we restrict usage of the API to only those formats. A string name, on > the other hand, allows users to define their own formats and use them with > the API. [I'm only answering based on the impression I get from your description above.] I don't see the link between "(not) extensible" and "enum": Extensibility is provided by API (which classes are public and meant to be reused, e.g. by custom IO code) while the "enum" defines the file formats that are "natively" supported by this library. [And, in time, custom code can become part of the supported formats and get its corresponding "enum" element.] The file extensions could be supported too, but I would think at a higher level; mapping to either the "enum", or to custom code. > For example, we will soon be creating a custom geometry format for an > application at my day job and I would like to be able to use these IO modules > to read and write that format (using custom BoundaryReadHandler3D and > BoundaryWriteHandler3D implementations). Is what I wrote above preventing that? Best, Gilles > > Thoughts? > > Regards, > Matt J > > > > > From: Gilles Sadowski > Sent: Wednesday, January 20, 2021 9:17 AM > To: Commons Developers List > Subject: Re: [geometry] IO Modules > > Hi Matt. > > Le mer. 20 janv. 2021 à 05:03, Matt Juntunen > a écrit : > > > > Hello, > > > > I've created GEOMETRY-115 and an associated PR [1] containing new modules > > for IO functionality. The new modules are > > > > * commons-geometry-core-io - Common space-independent interfaces and > > classes > > * commons-geometry-euclidean-io - Euclidean IO classes; currently > > contains support for the 3D formats TXT, CSV, and OBJ > > > > The API is based on a core BoundaryIOManager class that delegates to > > BoundaryReadHandler and BoundaryWriteHandler implementations based on the > > requested data format. For Euclidean 3D space, a convenience IO3D class is > > provided with static methods that delegate to a default manager instance. > > In addition to reading and writing the core geometric types for the library > > (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading > > and writing a FacetDefinition interface, which exposes simple, unvalidated > > geometric data. This is intended for accessing raw (possibly invalid) > > geometric data from files and writing data contained in external data > > structures (for example, a custom facet class). The example below is from > > the IO3D class documentation and demonstrates a read, transform, write > > operation using streams. > > > > final Path origFile = tempDir.resolve("orig.obj&quo
Re: [geometry] IO Modules
Hi Gilles, > package "io" does not belong in "core". Ah, yes. I'd like to keep the core and euclidean IO modules separate since I'm picturing a spherical IO module later on down the road. In that case, perhaps we could just put the "io" portion of the module/package name before everything else? That would give us commons-geometry-io-core - contains package o.a.c.geometry.io.core commons-geometry-io-euclidean - contains package o.a.c.geometry.io.euclidean > Then, we had also discussed that it would have been more robust to use a > parser generator... Yes. I looked into that and ended up deciding that a full parser-generator would be overkill for what I wanted to do and possibly more work to maintain/customize. I've placed the current low-level parsing code (consisting of the CharReadBuffer and SimpleTextParser classes) in the "internal" package o.a.c.geometry.core.io.internal. I don't intend for these classes to be part of the public API, which would allow us to switch to a parser-generator later if needed. > I haven't read all the code (sorry!) No worries. There is a lot of it :-) > I'm not convinced by the usage of file "extensions" to determine how the > contents should be parsed. At first sight, an "enum"-based approach[3] would > be more flexible and clearer from a code design POV. In contrast to the rest of the library, I'd like to make these IO modules as extensible as possible, the reason being that they form the main connection between the library and external systems. If we use an enum to indicate the format, we restrict usage of the API to only those formats. A string name, on the other hand, allows users to define their own formats and use them with the API. For example, we will soon be creating a custom geometry format for an application at my day job and I would like to be able to use these IO modules to read and write that format (using custom BoundaryReadHandler3D and BoundaryWriteHandler3D implementations). Thoughts? Regards, Matt J ________ From: Gilles Sadowski Sent: Wednesday, January 20, 2021 9:17 AM To: Commons Developers List Subject: Re: [geometry] IO Modules Hi Matt. Le mer. 20 janv. 2021 à 05:03, Matt Juntunen a écrit : > > Hello, > > I've created GEOMETRY-115 and an associated PR [1] containing new modules for > IO functionality. The new modules are > > * commons-geometry-core-io - Common space-independent interfaces and > classes > * commons-geometry-euclidean-io - Euclidean IO classes; currently > contains support for the 3D formats TXT, CSV, and OBJ > > The API is based on a core BoundaryIOManager class that delegates to > BoundaryReadHandler and BoundaryWriteHandler implementations based on the > requested data format. For Euclidean 3D space, a convenience IO3D class is > provided with static methods that delegate to a default manager instance. In > addition to reading and writing the core geometric types for the library > (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading and > writing a FacetDefinition interface, which exposes simple, unvalidated > geometric data. This is intended for accessing raw (possibly invalid) > geometric data from files and writing data contained in external data > structures (for example, a custom facet class). The example below is from the > IO3D class documentation and demonstrates a read, transform, write operation > using streams. > > final Path origFile = tempDir.resolve("orig.obj"); > final Path scaledFile = tempDir.resolve("scaled.csv"); > > final DoublePrecisionContext precision = new > EpsilonDoublePrecisionContext(1e-10); > final BoundarySource3D src = Parallelepiped.unitCube(precision); > > IO3D.write(src, origFile); > > final AffineTransformMatrix3D transform = > AffineTransformMatrix3D.createScale(2); > > try (Stream stream = IO3D.triangles(origFile, precision)) > { > IO3D.write(stream.map(t -> t.transform(transform)), scaledFile); > } > > Feedback is welcome. The high-level API, illustrated in the example above, looks quite fine. But we should enforce "separation of concerns"; in this particular case: package "io" does not belong in "core". The Java package org.apache.commons.geometry.core is defined the within the (maven) module commons-geometry-core The new Java package org.apache.commons.geometry.core.io is defined the within the (maven) module commons-geometry-core-io The (maven) module name is also often used to define the Java 9+ module name for JPMS.[1] IIUC, the package org.apache.commons.geometry.core and its "sub-pa
Re: [geometry] IO Modules
Hi Matt. Le mer. 20 janv. 2021 à 05:03, Matt Juntunen a écrit : > > Hello, > > I've created GEOMETRY-115 and an associated PR [1] containing new modules for > IO functionality. The new modules are > > * commons-geometry-core-io - Common space-independent interfaces and > classes > * commons-geometry-euclidean-io - Euclidean IO classes; currently > contains support for the 3D formats TXT, CSV, and OBJ > > The API is based on a core BoundaryIOManager class that delegates to > BoundaryReadHandler and BoundaryWriteHandler implementations based on the > requested data format. For Euclidean 3D space, a convenience IO3D class is > provided with static methods that delegate to a default manager instance. In > addition to reading and writing the core geometric types for the library > (ConvexPolygon3D, Triangle3D), the Euclidean module also supports reading and > writing a FacetDefinition interface, which exposes simple, unvalidated > geometric data. This is intended for accessing raw (possibly invalid) > geometric data from files and writing data contained in external data > structures (for example, a custom facet class). The example below is from the > IO3D class documentation and demonstrates a read, transform, write operation > using streams. > > final Path origFile = tempDir.resolve("orig.obj"); > final Path scaledFile = tempDir.resolve("scaled.csv"); > > final DoublePrecisionContext precision = new > EpsilonDoublePrecisionContext(1e-10); > final BoundarySource3D src = Parallelepiped.unitCube(precision); > > IO3D.write(src, origFile); > > final AffineTransformMatrix3D transform = > AffineTransformMatrix3D.createScale(2); > > try (Stream stream = IO3D.triangles(origFile, precision)) > { > IO3D.write(stream.map(t -> t.transform(transform)), scaledFile); > } > > Feedback is welcome. The high-level API, illustrated in the example above, looks quite fine. But we should enforce "separation of concerns"; in this particular case: package "io" does not belong in "core". The Java package org.apache.commons.geometry.core is defined the within the (maven) module commons-geometry-core The new Java package org.apache.commons.geometry.core.io is defined the within the (maven) module commons-geometry-core-io The (maven) module name is also often used to define the Java 9+ module name for JPMS.[1] IIUC, the package org.apache.commons.geometry.core and its "sub-package" org.apache.commons.geometry.core.io cannot belong to different (JPMS) modules, because (IIUC) a package hierarchy cannot be shared across modules (since one of the purposes is to organize "visibility"). Components should be JPMS-compatible.[2] Fixing that would just require to 1. remove the two modules introduced in PR #130 commons-geometry-core-io commons-geometry-euclidean-io 2. create one (maven) module commons-geometry-io owning the Java packages org.apache.commons.geometry.io org.apache.commons.geometry.io.core org.apache.commons.geometry.io.euclidean [If you think that it's really worth having separate modules for loading "euclidean" data vs other space-types (not yet implemented IIUC), point (2) would be adapted accordingly.] This would fix both the JPMS issue, and the design issue. Then, we had also discussed that it would have been more robust to use a parser generator... In either case, I'd suggest that we put the actual I/O classes in an "internal" package, and only advertize the above high-level API (i.e. we should not be expected to maintain compatibility of the parser code). I haven't read all the code (sorry!), but in class "IO3D", I'm not convinced by the usage of file "extensions" to determine how the contents should be parsed. At first sight, an "enum"-based approach[3] would be more flexible and clearer from a code design POV. Regards, Gilles [1] https://www.oracle.com/corporate/features/understanding-java-9-modules.html [2] See https://gitbox.apache.org/repos/asf?p=commons-rng.git;a=tree;f=commons-rng-examples/examples-jpms;hb=HEAD [3] See https://gitbox.apache.org/repos/asf?p=commons-rng.git;a=blob;f=commons-rng-simple/src/main/java/org/apache/commons/rng/simple/RandomSource.java;h=65ca02fdc285fbb7ea8305008dbce21f571191d7;hb=HEAD > > Regards, > Matt J > > [1] https://github.com/apache/commons-geometry/pull/130 - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org