I don't know what the general feeling is, but I've always felt that there should be a top-level ETL module namespace (if you don't count practical extraction and reporting language :). The issue is that there doesn't appear to be much community consensus on best practices for ETL behavior or methods. I suspect the variation in that namespace early on might be distracting. Or maybe if you build it, they will come?
I notice that you have Extract and Load covered in your proposal. Do you also have Transform and logging on the way?

Best Regards,

Jed (JANDREW <https://metacpan.org/author/JANDREW>)

On Tue, May 3, 2016 at 2:23 AM, Nelson Ferraz <nfer...@gmail.com> wrote:

> I'm the maintainer of the DataWarehouse::* modules.
>
> Let me know if you would like to use the DataWarehouse::ETL namespace.
>
> On Tue, May 3, 2016 at 10:36 AM, Smylers <smyl...@stripey.com> wrote:
>
>> Robert Wohlfarth writes:
>>
>> > I am looking to release a collection of modules for converting data.
>> > The modules read data from a source, convert the data, then add it
>> > into an SQL database.
>> >
>> > The modules are named like this...
>> > * Data::ETL
>> > * Data::ETL::Extract
>> > * Data::ETL::Extract::Excel
>> > * Data::ETL::Extract::DelimitedText
>> > * Data::ETL::Extract::XML
>> > * Data::ETL::Load
>> > * Data::ETL::MSAccess
>> >
>> > In my mind, ETL means "Extract-Transform-Load".
>>
>> That wouldn't've occurred to me, but the Wikipedia page for ‘Extract,
>> transform, load’ is the top link when searching DuckDuckGo for “ETL”, so
>> it seems reasonable to use it in a module name if your target audience
>> is people already working in the field and familiar with its jargon.
>>
>> > Is "Data" an appropriate place?
>>
>> Yes ... and no. Data:: is appropriate for pretty much every module on
>> CPAN, in that an awful lot of code does stuff with data. That makes it a
>> suboptimal namespace, because it doesn't define what's specific about
>> this particular module.
>>
>> In particular, it didn't to me suggest databases, or even data
>> warehousing (which the ETL Wikipedia page suggests is the main use of
>> ETL). It'd be good for the name to indicate that field in some way.
>>
>> > Thoughts on the naming convention "Data::ETL"?
>>
>> The combination of a very broad namespace and an acronym makes it hard
>> to guess at the area of the module; for instance, that would be an
>> equally good name for a module that processes data searching for
>> extra-terrestrial life ...
>>
>> If the database-loading part uses DBI connections then the DBIx::
>> namespace would be good for indicating that.
>>
>> Unfortunately for you, DataWarehouse::ETL is already used by another
>> module. Ideally you'd mention that module in your docs, explaining to
>> new users the difference between them. If your name can help to indicate
>> the distinctive feature of yours, so much the better, but often that
>> isn't possible if they are simply different approaches to the same
>> problem.
>>
>> One possibility for a suite of connected modules that only really work
>> together is to concoct a ‘fanciful’ brand name for the framework, like
>> Moose or Catalyst and put all your modules under either $Brand:: or
>> something like DataWarehouse::$Brand::.
>>
>> A framework name works well if, say, your $whatever::Extract::Excel
>> module is only intended to be used with other modules in your framework
>> and doesn't really make sense as a standalone module for somebody just
>> wanting to extract data from an Excel spreadsheet (and get back a Perl
>> data structure they can do what they want with). The brand name
>> indicates that it's part of the framework and to be used with that.
>>
>> Hope that helps.
>>
>> Smylers
>> --
>> http://twitter.com/Smylers2
>>
>
> --
> Nelson Ferraz
>