Hello.

On Mon, 25 Jan 2021 at 13:40, Matt Juntunen
<matt.juntu...@hotmail.com> wrote:
>
> Hello,
>
> I have two main goals for the IO modules here:
>
>   1.  Provide a simple, high-level API (i.e. IO3D) for reading and writing 
> geometry with a minimum of fuss.

Sure.  But this high-level API looks like "syntactic sugar" that can
certainly be done in a number of ways (with more or less code and/or
more or less flexibility).

>   2.  Provide a low-level, extensible API specific to each data format that
> can be used to access additional format-specific information while reading and
> provide greater control over the output while writing.
>
> So, there are actually two different APIs in question here. Users could use 
> the high-level API when only the geometry itself is of interest and the 
> low-level API when additional metadata is required. Useful examples of this 
> metadata are the object and group names from the OBJ format (which can be 
> used to store separate geometries in a single file) and the facet attribute 
> bytes in binary STL files (which are sometimes used to store color 
> information or other values). This information does not map directly to any 
> data structures in commons-geometry but it is certainly useful to be able to 
> access it (I will want to do so in my day job, for instance).

In effect, how would one map (application) data that is tied to a
(library) facet instance?
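
To make the question concrete, here is a minimal sketch of one possible
approach: an identity-keyed side table kept by the application.  Everything
in it is an assumption for illustration; "Facet" is a stand-in class, not the
commons-geometry type, and "FacetAttributes" is a hypothetical name.  It only
works if the reader hands back the very same facet instances that the
application later looks up, which is exactly the part that is unclear.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical application-side table that attaches metadata to facet instances. */
final class FacetAttributes {

    /** Stand-in for a library facet type (assumption, not the real class). */
    static final class Facet {
        final String label;

        Facet(String label) {
            this.label = label;
        }
    }

    /** Keys are compared by reference, so equal-looking facets stay distinct. */
    private final Map<Facet, Integer> attributes = new IdentityHashMap<>();

    /** Stores e.g. the two STL attribute bytes (packed into an int) for a facet. */
    void put(Facet facet, int attributeBytes) {
        attributes.put(facet, attributeBytes);
    }

    /** Returns the stored value, or null if nothing was recorded for this instance. */
    Integer get(Facet facet) {
        return attributes.get(facet);
    }
}
```

The identity map avoids relying on "equals" for facets, at the price of
losing the association as soon as the library copies or transforms a facet.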

>
> > Such customization could also be handled at the application level through
> > a (handler-specific) property file.
>
> I'd rather not deal with configuration files and keep things simple and 
> lightweight.

Maybe I was not clear, because I don't see how such configuration file(s)
would make it less lightweight from a casual user's POV.  A default configuration
would be provided (that associates default extensions with the library-provided
handlers).  Users could easily append extensions and handlers, rather than
having to do it in code.  It also seems like a feature that one could disable a
handler (rather than having code always loaded for a format which the user
doesn't actually want to use).
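
As a rough illustration of that layering (a sketch only: the class name
"LayeredConfig" and the property keys are hypothetical, nothing here exists
in commons-geometry), "java.util.Properties" already supports a table of
defaults that user-supplied entries override:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: library defaults overlaid with user-provided settings. */
final class LayeredConfig {

    private LayeredConfig() {}

    /**
     * Returns a Properties instance in which keys from userText win and
     * keys found only in defaultsText remain visible through getProperty.
     */
    static Properties layer(String defaultsText, String userText) {
        Properties defaults = load(new Properties(), defaultsText);
        return load(new Properties(defaults), userText);
    }

    private static Properties load(Properties target, String text) {
        try {
            target.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string is not expected to fail.
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

With defaults listing "formats=OBJ,TXT", a user file containing only
"formats=OBJ" would effectively disable the TXT handler, while a user line
"OBJ.extensions=obj,wavefront" would append an extension without any code.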

> > Then the case for the "enum" is moot (IIUC).
>
> Yes, it might be. I would like to allow format names to be mapped to more 
> than one file extension, though.

I don't understand how "enum" and multiple extensions are related...
My remark was about the "enum" usage in order to enforce a single API
for file formats (but I understood that your use-case requires more flexibility).

>
> > User-code should be in charge of associating input (e.g. file name) with 
> > how to handle it (e.g. the instantiation of the read handler).
>
> This would be the case for the low-level API, but I want the high-level API 
> to be able to handle this itself, based on its configuration. I want to be 
> able to call 'IO3D.read(Paths.get("cube.obj"))'

This would just work (the devil being in the details) with the hypothetical
"handlers.conf" file containing:
---CUT---
formats=OBJ
OBJ.extensions=obj,OBJ
OBJ.reader=org.apache.commons.geometry.io.euclidean.threed.ObjFormatReader
---CUT---
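
A minimal sketch of how such a file could be read, assuming the key layout
shown above ("formats", "<NAME>.extensions", "<NAME>.reader").  The class
"HandlerConfig" is hypothetical, and the reader class name is only kept as a
string here; actually instantiating it (e.g. by reflection) is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical loader for the "handlers.conf" layout shown above. */
final class HandlerConfig {

    /** Maps a file extension, exactly as listed in the file, to a reader class name. */
    private final Map<String, String> readerByExtension = new LinkedHashMap<>();

    private HandlerConfig() {}

    /** Parses the configuration text in java.util.Properties syntax. */
    static HandlerConfig parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        HandlerConfig cfg = new HandlerConfig();
        for (String format : split(props.getProperty("formats", ""))) {
            String reader = props.getProperty(format + ".reader");
            if (reader == null) {
                continue; // format listed without a reader: nothing to register
            }
            for (String ext : split(props.getProperty(format + ".extensions", ""))) {
                cfg.readerByExtension.put(ext, reader.trim());
            }
        }
        return cfg;
    }

    /** Returns the reader class name for the file name, or null if none matches. */
    String readerFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        return readerByExtension.get(fileName.substring(dot + 1));
    }

    /** Splits a comma-separated value, ignoring surrounding whitespace. */
    private static String[] split(String csv) {
        String s = csv.trim();
        return s.isEmpty() ? new String[0] : s.split("\\s*,\\s*");
    }
}
```

The lookup is case-sensitive on purpose, which is why the example file lists
both "obj" and "OBJ".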

Regards,
Gilles

> just as I might call 'ImageIO.read(new File("image.png"))'.
>
> Regards,
> Matt J
>
> ________________________________
> From: Gilles Sadowski <gillese...@gmail.com>
> Sent: Saturday, January 23, 2021 9:40 AM
> To: Commons Developers List <dev@commons.apache.org>
> Subject: Re: [geometry] IO Modules
>
> Hi.
>
> On Fri, 22 Jan 2021 at 03:38, Matt Juntunen
> <matt.juntu...@hotmail.com> wrote:
> >
> > Hi Gilles,
> >
> > > Really, the main point is to separate format (contents) from filename 
> > > (container).
> >
> > This makes sense. What would you think of the approach below?
>
> I have no strong objections, as I do not grasp all the requirements.
> [Maybe, IO-related stuff is always bound to be messy (cf. "java.io" vs
> "java.nio").]
>
> > This would separate the format name from the file extension(s) and provide 
> > an enum containing default format information and handlers. Usage of the 
> > enum would be optional since there would still be overloads that accept a 
> > simple format name string.
>
> It reminds me of a discussion concerning "Bloom filters", about identifiers
> for a hash function that could be user-defined.
> IIRC, one idea (proposed by Alex) was to maintain a text file of (unique)
> identifiers.
>
> > For the BoundaryIOManager methods that accept a Path or URL, the format 
> > would still be determined by the file extension.
>
> I'm uncomfortable with having that kind of assumption in a low-level library
> (bad reminiscence of M$-DOS days).  User-code should be in charge of
> associating input (e.g. file name) with how to handle it (e.g. the
> instantiation of the read handler).
>
> > If users want to use a non-standard file extension, they can open the IO 
> > stream themselves and use the read/write methods that accept an IO stream 
> > and format string name or Format instance.
>
> What is "standard"/"non-standard"?  You use "txt", but the most standard
> meaning of this extension is that the content is ASCII-encoded...
> And "csv" is also not sufficient to convey that the content is actually much
> more constrained than a comma-separated list of strings.
>
> Couldn't a file be used to define which reader/writer the library should
> instantiate, and with which extension it should be associated?
>
> >
> >     interface Format {
> >         String getName();
> >         List<String> getFileExtensions();
> >     }
> >
> >     class BoundaryIOManager {
> >         void register(BoundaryFormat fmt, BoundaryReadHandler rh,
> >                 BoundaryWriteHandler wh) {
> >             register(fmt.getName(), fmt.getFileExtensions(), rh, wh);
> >         }
> >         void register(String formatName, List<String> extensions,
> >                 BoundaryReadHandler rh, BoundaryWriteHandler wh) {...}
> >
> >         // ...
> >
> >         void write(BoundarySource src, OutputStream out, Format fmt) {
> >             write(src, out, fmt.getName());
> >         }
> >         void write(BoundarySource src, OutputStream out, String formatName) {...}
> >
> >         // similar read methods ...
> >     }
> >
> >     enum StandardFormat3D implements Format {
> >         OBJ(...),
> >         TXT(...),
> >         CSV(...);
> >
> >         public String getName() {...}
> >         public List<String> getFileExtensions() {...}
> >         public BoundaryReadHandler3D readHandler() {
> >             // (execute a supplier function) ...
> >         }
> >         public BoundaryWriteHandler3D writeHandler() {
> >             // (execute a supplier function) ...
> >         }
> >     }
> >
> > > The "enum" is for natively supported formats to allow for simple API, 
> > > while "hiding" the actual implementations (as in "RandomSource" from 
> > > "Commons RNG").
> >
> > I'd prefer to not hide the format-specific classes, at least not completely.
>
> Then the case for the "enum" is moot (IIUC).
>
> > For example, the OBJ file format can contain a lot more information than 
> > just pure geometry, such as object names (more than one geometry can be 
> > contained in a single file), material information (for use in rendering), 
> > free-form curve definitions, etc. This information is not used to produce 
> > BoundarySource3D or Mesh instances but it can be accessed easily by 
> > extending AbstractOBJParser or PolygonOBJParser. Also, additional 
> > information such as comments and object names can be included in output 
> > files if the OBJWriter class is used directly, as opposed to IO3D or 
> > BoundaryIOManager3D. It seems like a waste to completely hide this 
> > functionality.
>
> I agree to not waste functionality.  But how is the additional content
> handled currently?  It seems that it is simply discarded, and someone
> wanting to retrieve it would then discard the current functionality that
> only returns a "BoundarySource3D".
> Sorry if I'm missing something because of my not having read the code
> but this makes me think that a parser generator would have allowed
> for extending the support of a given format.
>
> >
> > Another reason to keep these classes public is that they may need to be 
> > accessed in order to configure them. For example, the txt, csv, and obj 
> > formats use a default format pattern for writing floating point numbers as 
> > text. If this needs to be modified, for example to increase or decrease the 
> > number of minimum fraction digits, then the format-specific type will need 
> > to be accessed. The code below shows how to set a custom decimal format for 
> > OBJ files (using the current code).
> >
> >     OBJBoundaryWriteHandler3D wh = new OBJBoundaryWriteHandler3D();
> >     wh.setDecimalFormatPattern("0.0##");
>
> Such customization could also be handled at the application level through
> a (handler-specific) property file.
>
> It would be interesting to ask for more opinions about how to handle
> configurations and files (posting a message to "[All]").
>
> >
> >     IO3D.getDefaultManager().registerWriteHandler("obj", wh);
> >
> > One additional question that I thought of while looking at your example 
> > code: what is our convention for class names that contain acronyms or other 
> > sequences of capitalized letters? In other words, should it be OBJWriter or 
> > ObjWriter?
>
> I'd say "Obj..." (because only initials should be capitalized).
> But "ObjWriter" in Java code could mean anything...
> Perhaps "ObjFormatWriter"?
>
> Best,
> Gilles
>
> >> [...]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

