Hello.

On Mon, 25 Jan 2021 at 13:40, Matt Juntunen <matt.juntu...@hotmail.com> wrote:
>
> Hello,
>
> I have two main goals for the IO modules here:
>
> 1. Provide a simple, high-level API (i.e. IO3D) for reading and writing
> geometry with a minimum of fuss.

Sure. But this high-level API looks like "syntactic sugar" that can
certainly be done in a number of ways (with more or less code and/or more
or less flexibility).

> 2. Provide a low-level, extensible API specific to each data format that
> can be used to access additional format-specific information while reading and
> provide greater control over the output while writing.
>
> So, there are actually two different APIs in question here. Users could use
> the high-level API when only the geometry itself is of interest and the
> low-level API when additional metadata is required. Useful examples of this
> metadata are the object and group names from the OBJ format (which can be
> used to store separate geometries in a single file) and the facet attribute
> bytes in binary STL files (which are sometimes used to store color
> information or other values). This information does not map directly to any
> data structures in commons-geometry but it is certainly useful to be able to
> access it (I will want to do so in my day job, for instance).

In effect, how would one map (application) data that is tied to a (library)
facet instance?

>
> > Such customization could also be handled at the application level through
> > a (handler-specific) property file.
>
> I'd rather not deal with configuration files and keep things simple and
> lightweight.

Maybe I was not clear, because I don't see how this configuration file
(or files) makes it less lightweight from a casual user's POV.
A default configuration would be provided (that associates default
extensions with the library-provided handlers).
Users could easily append extensions and handlers, rather than having to do
it in code.
It also seems like a feature that one could disable a handler (rather than
having code always loaded for a format which the user doesn't actually want
to use).

> > Then the case for the "enum" is moot (IIUC).
>
> Yes, it might be. I would like to allow format names to be mapped to more
> than one file extension, though.

I don't understand how "enum" and multiple extensions are related...
My remark was about the "enum" usage in order to enforce a single API for
file format (but I understood that your use-case requires more flexibility).

>
> > User-code should be in charge of associating input (e.g. file name) with
> > how to handle it (e.g. the instantiation of the read handler).
>
> This would be the case for the low-level API, but I want the high-level API
> to be able to handle this itself, based on its configuration. I want to be
> able to call 'IO3D.read(Paths.get("cube.obj"))'

This would just work (the devil being in the details) with the hypothetical
"handlers.conf" file containing:
---CUT---
formats=OBJ
OBJ.extensions=obj,OBJ
OBJ.reader=org.apache.commons.geometry.io.euclidean.threed.ObjFormatReader
---CUT---

Regards,
Gilles

> just as I might call 'ImageIO.read(new File("image.png"))'.
>
> Regards,
> Matt J
>
> ________________________________
> From: Gilles Sadowski <gillese...@gmail.com>
> Sent: Saturday, January 23, 2021 9:40 AM
> To: Commons Developers List <dev@commons.apache.org>
> Subject: Re: [geometry] IO Modules
>
> Hi.
>
> On Fri, 22 Jan 2021 at 03:38, Matt Juntunen
> <matt.juntu...@hotmail.com> wrote:
> >
> > Hi Gilles,
> >
> > > Really, the main point is to separate format (contents) from filename
> > > (container).
> >
> > This makes sense. What would you think of the approach below?
>
> I have no strong objections, as I do not grasp all the requirements.
> [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs
> "java.nio").]
>
> > This would separate the format name from the file extension(s) and provide
> > an enum containing default format information and handlers. Usage of the
> > enum would be optional since there would still be overloads that accept a
> > simple format name string.
>
> It reminds me of a discussion concerning "Bloom filters", about identifiers
> for a hash function that could be user-defined.
> IIRC, one idea (proposed by Alex) was to maintain a text file of (unique)
> identifiers.
>
> > For the BoundaryIOManager methods that accept a Path or URL, the format
> > would still be determined by the file extension.
>
> I'm uncomfortable with having that kind of assumption in a low-level library
> (bad reminiscence of M$-DOS days). User-code should be in charge of
> associating input (e.g. file name) with how to handle it (e.g. the
> instantiation of the read handler).
>
> > If users want to use a non-standard file extension, they can open the IO
> > stream themselves and use the read/write methods that accept an IO stream
> > and format string name or Format instance.
>
> What is "standard"/"non-standard"? You use "txt", but the most standard
> meaning of this extension is that the contents are ASCII-encoded...
> And "csv" is also not sufficient to convey that the contents are actually
> much more constrained than a comma-separated list of strings.
>
> Couldn't a file be used to define which reader/writer the library should
> instantiate, and to which extension it could be associated?
>
> >
> > interface Format {
> >     String getName();
> >     List<String> getFileExtensions();
> > }
> >
> > class BoundaryIOManager {
> >     void register(BoundaryFormat fmt, BoundaryReadHandler rh,
> >             BoundaryWriteHandler wh) {
> >         register(fmt.getName(), fmt.getFileExtensions(), rh, wh);
> >     }
> >     void register(String formatName, List<String> extensions,
> >             BoundaryReadHandler rh, BoundaryWriteHandler wh) {...}
> >
> >     // ...
> >
> >     void write(BoundarySource src, OutputStream out, Format fmt) {
> >         write(src, out, fmt.getName());
> >     }
> >     void write(BoundarySource src, OutputStream out, String formatName)
> >     {...}
> >
> >     // similar read methods ...
> > }
> >
> > enum StandardFormat3D implements Format {
> >     OBJ(...),
> >     TXT(...),
> >     CSV(...);
> >
> >     public String getName() {...}
> >     public List<String> getFileExtensions() {...}
> >     public BoundaryReadHandler3D readHandler() { (execute a supplier function)... }
> >     public BoundaryWriteHandler3D writeHandler() { (execute a supplier function)... }
> > }
> >
> > > The "enum" is for natively supported formats to allow for simple API,
> > > while "hiding" the actual implementations (as in "RandomSource" from
> > > "Commons RNG").
> >
> > I'd prefer to not hide the format-specific classes, at least not completely.
>
> Then the case for the "enum" is moot (IIUC).
>
> > For example, the OBJ file format can contain a lot more information than
> > just pure geometry, such as object names (more than one geometry can be
> > contained in a single file), material information (for use in rendering),
> > free-form curve definitions, etc. This information is not used to produce
> > BoundarySource3D or Mesh instances but it can be accessed easily by
> > extending AbstractOBJParser or PolygonOBJParser. Also, additional
> > information such as comments and object names can be included in output
> > files if the OBJWriter class is used directly, as opposed to IO3D or
> > BoundaryIOManager3D. It seems like a waste to completely hide this
> > functionality.
>
> I agree to not waste functionality. But how is the additional content
> handled currently? It seems that it is simply discarded, and someone
> wanting to retrieve it would then discard the current functionality that
> only returns a "BoundarySource3D".
> Sorry if I'm missing something because of my not having read the code
> but this makes me think that a parser generator would have allowed
> for extending the support of a given format.
>
> >
> > Another reason to keep these classes public is that they may need to be
> > accessed in order to configure them.
> > For example, the txt, csv, and obj
> > formats use a default format pattern for writing floating point numbers as
> > text. If this needs to be modified, for example to increase or decrease the
> > number of minimum fraction digits, then the format-specific type will need
> > to be accessed. The code below shows how to set a custom decimal format for
> > OBJ files (using the current code).
> >
> > OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D();
> > wh.setDecimalFormatPattern("0.0##");
>
> Such customization could also be handled at the application level through
> a (handler-specific) property file.
>
> It would be interesting to ask for more opinions about how to handle
> configurations and files (posting a message to "[All]").
>
> >
> > IO3D.getDefaultManager().registerWriteHandler("obj", wh);
> >
> > One additional question that I thought of while looking at your example
> > code: what is our convention for class names that contain acronyms or other
> > sequences of capitalized letters? In other words, should it be OBJWriter or
> > ObjWriter?
>
> I'd say "Obj..." (because only initials should be capitalized).
> But "ObjWriter" in Java code could mean anything...
> Perhaps "ObjFormatWriter"?
>
> Best,
> Gilles
>
> >> [...]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
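
For illustration only, the hypothetical "handlers.conf" layout mentioned in
this thread (formats=..., <FORMAT>.extensions=..., <FORMAT>.reader=...) could
be read with the standard java.util.Properties class roughly as sketched
below. The class HandlerConfig and its methods are assumptions made up for
this sketch; they are not part of commons-geometry, and the reader class name
is only handled as a string since "ObjFormatReader" is itself hypothetical.
Extensions are matched exactly as listed (case-sensitive), which is why the
example lists both "obj" and "OBJ".

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for this sketch; not part of commons-geometry. */
final class HandlerConfig {

    private HandlerConfig() {}

    /** Parses configuration text written in java.util.Properties syntax. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Maps each configured file extension (exactly as listed) to the reader
     * class name of its format. Formats without a ".reader" entry are skipped.
     */
    static Map<String, String> readersByExtension(Properties props) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String format : split(props.getProperty("formats", ""))) {
            String reader = props.getProperty(format + ".reader");
            if (reader == null) {
                continue; // format listed but no reader configured
            }
            for (String ext : split(props.getProperty(format + ".extensions", ""))) {
                result.put(ext, reader.trim());
            }
        }
        return result;
    }

    /** Returns the text after the last '.' in the name, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** Splits a comma-separated value, trimming items and dropping empty ones. */
    private static List<String> split(String value) {
        List<String> items = new ArrayList<>();
        for (String item : value.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
            "formats=OBJ",
            "OBJ.extensions=obj,OBJ",
            "OBJ.reader=org.apache.commons.geometry.io.euclidean.threed.ObjFormatReader");
        Map<String, String> readers = readersByExtension(parse(conf));
        // Look up the reader class name configured for "cube.obj".
        System.out.println(readers.get(extensionOf("cube.obj")));
    }
}
```

With such a table, a high-level call in the style of
'IO3D.read(Paths.get("cube.obj"))' could select its handler from the
configuration rather than from code, and a format could be disabled by
removing it from the "formats" entry. How the reader would actually be
instantiated is left open here.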