Hi,

First, I agree with bumping this issue. At the customer site, this issue cost us a lot of time figuring out what was going on.

I am not sure I like having the extension as part of the table name, because:

- I would never create a table in a relational database with a dot in the name.
- It creates an ambiguity. If you have a "full" path name to a column, like "documents.people.csv.name", then it is not clear whether the schema name is "documents.people" and the table name is "csv", or the schema name is "documents" and the table name is "people.csv". It seems natural to me that schema names contain dots, but not table names.
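To make the ambiguity concrete, here is a minimal sketch in plain Java. It does not use any MetaModel API; the class and method names are hypothetical, and it assumes the column name itself contains no dots. It lists every way a dotted column path could be divided into schema, table and column:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of MetaModel.
class ColumnPathAmbiguity {

    /**
     * Returns every possible {schema, table, column} reading of a
     * dot-separated column path. Assumes the last token is the column
     * and that schema and table each consist of at least one token.
     */
    static List<String[]> possibleSplits(String path) {
        String[] parts = path.split("\\.");
        int columnIndex = parts.length - 1;
        List<String[]> result = new ArrayList<>();
        // The schema/table boundary may fall after any token before the column.
        for (int boundary = 1; boundary < columnIndex; boundary++) {
            String schema = String.join(".", Arrays.copyOfRange(parts, 0, boundary));
            String table = String.join(".", Arrays.copyOfRange(parts, boundary, columnIndex));
            result.add(new String[] { schema, table, parts[columnIndex] });
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] split : possibleSplits("documents.people.csv.name")) {
            System.out.println("schema=" + split[0] + ", table=" + split[1] + ", column=" + split[2]);
        }
    }
}
```

For "documents.people.csv.name" this yields two readings (schema "documents" with table "people.csv", and schema "documents.people" with table "csv"), so a parser cannot pick one without extra knowledge. A path whose table name has no dots yields exactly one.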
Alternatives:

- Leave the extension out of the name. This is probably not acceptable, because then you can no longer have two "tables" differing only in extension, although I must say that personally I think this would be the best solution.
- Use a conventional name, like:

  Schema name: Folder name
  Table name: The filename, including extension (all dots replaced by underscores).
  Resulting in e.g. a column path like this:
  documents.people_csv.name

At the customer site, the file I needed to use was actually named along the lines of "bar/FOO.PEOPLE.IN.FILE". Using the convention, this would become:

bar.FOO_PEOPLE_IN_FILE

IMHO this is preferable to "bar.foo.people.in.file". The problem is of course that it would now be impossible to also have another file "bar/FOO_PEOPLE_IN_FILE" :-(

I am happy to hear other people's thoughts.

Hans

-----Original Message-----
From: Kasper Sørensen [mailto:[email protected]]
Sent: Wednesday, August 14, 2013 10:18 AM
To: [email protected]
Subject: Re: [DISCUSS] use folder name as schema name for file based DataContexts

Rats, made a mistake in that diff. The Gist has been updated [1] and now contains the ResourceUtils class which was missing before.

[1] https://gist.github.com/kaspersorensen/6210970

2013/8/12 Kasper Sørensen <[email protected]>:
> Here's a proposed patch (implemented for CSV and fixedwidth files
> which are the modules that implemented the old schema naming pattern):
> https://gist.github.com/kaspersorensen/6210970
>
> 2013/8/10 Kasper Sørensen <[email protected]>:
>> https://issues.apache.org/jira/browse/METAMODEL-4
>>
>> 2013/8/10 Henry Saputra <[email protected]>:
>>> What is the JIRA for this one?
>>>
>>>
>>> On Fri, Aug 9, 2013 at 2:26 AM, Manuel van den Berg <
>>> [email protected]> wrote:
>>>
>>>> +1
>>>>
>>>> (shouldn't I just vote on the Jira for this?)
>>>>
>>>> manuel
>>>>
>>>> > -----Original Message-----
>>>> > From: Kasper Sørensen [mailto:[email protected]]
>>>> > Sent: Friday, August 09, 2013 9:03
>>>> > To: [email protected]
>>>> > Subject: Re: [DISCUSS] use folder name as schema name for file
>>>> > based DataContexts
>>>> >
>>>> > Allow me to bump this issue (it's my impression that more people
>>>> > have joined in a bit late, after this topic was posted).
>>>> >
>>>> > I think this is one of the more important issues that I would
>>>> > want to fix before we make our first release at Apache.
>>>> >
>>>> > 2013/7/24 Kasper Sørensen <[email protected]>:
>>>> > > Right now we have this slightly odd naming convention for
>>>> > > schema and table names when building metadata for e.g. a CSV
>>>> > > file or a fixed width value file.
>>>> > >
>>>> > > Schema name: The filename, including file extension.
>>>> > > Table name: The filename without extension.
>>>> > > Resulting in e.g. a column path like this:
>>>> > > people.csv.people.name
>>>> > >
>>>> > > I suggest we change it to this convention:
>>>> > >
>>>> > > Schema name: Folder name
>>>> > > Table name: The filename, including file extension.
>>>> > > Resulting in e.g. a column path like this:
>>>> > > documents.people.csv.name
>>>> > >
>>>> > > Why do I think this would be an improvement?
>>>> > >
>>>> > > 1) Because this would first of all make a kind of sense to the
>>>> > > user to see the file system's hierarchy reflected in the schema model.
>>>> > > 2) Because it allows us to make these DataContext's operate not
>>>> > > on a single file, but on a directory of files. I have seen this
>>>> > > quite a number of times by now that users of MetaModel, or users of
>>>> > > e.g. DataCleaner, which uses MetaModel quite heavily, wants to do
>>>> > > this sort of stuff.
>>>> > > 3) The removing of the file extension stuff is kind of broken
>>>> > > and a strange convention in the first place.
>>>> > >
>>>> > > While this doesn't really break backwards compatibility in
>>>> > > terms of Java code, it would break configuration files and
>>>> > > other stuff of applications that use MetaModel. But I do
>>>> > > believe that can be communicated and handled through carefully
>>>> > > explaining the new convention on the migration page (that I recently
>>>> > > started writing [1]).
>>>> > >
>>>> > > What do you think?
>>>> > >
>>>> > > [1]
>>>> > > http://wiki.apache.org/metamodel/MigratingFromEobjectsMetaModel
>>>>
