Hi

On Wed, May 22, 2013 at 10:48 PM, Chris Crook <ccr...@linz.govt.nz> wrote:
> Hi Régis
>
> Interesting thoughts. I've renamed this thread and copied it in to
> qgis-developer, as I think it is worth getting broader input on this. Hope
> you don't mind my putting your email there....
>
> If I've understood correctly, the main points of your suggestion are:
>
> 1) The delimited text provider and the GUI should be able to read a VRT/CSVT
> file if one is present to determine field types etc
>
> 2) The delimited text provider GUI should be able to save settings to a
> VRT/CSVT
>
> 3) The user should be able to explicitly set data types for each column
>
> This is in line with some of my thinking on this. I was planning to add some
> way of saving settings as a "delimited text file type", that could then be
> selected when adding a new text layer. At the moment (ie in master, not 1.8)
> the plugin remembers settings based on file extension, but that doesn't
> provide enough granularity for me. This would be analogous to saving styles
> in QGIS. I hadn't decided where to store the settings yet (ie whether in a
> file, or in the QGIS settings, or... )
>
> Like you, I am also not very happy with the way that the provider determines
> field types by scanning the file when it loads it. So if you put different
> data in the file, and reload it into QGIS, then the data types may change.
>
> I hadn't thought about using VRT or CSVT files and I really like the idea.
>
> As a really simple first step, which would not be a major refactoring, the
> provider could check for a CSVT file when it loads as CSV and use it to
> determine field types - this would not require any UI or API changes at all.
> That on its own could be very useful. The only difficulty I can foresee with
> this is how to manage files with names other than ".csv". Should it just
> look for a file matching the name of the input file with a "t" at the end?
> Or should it only use this option for files that are named ".csv"?
> Or should it look for ".csvt" whatever the name of the file? Other than that
> this really isn't much work, and I'd be keen to implement it. I guess the
> simplest approach would be the first option, looking for a file named the
> same as the data file but with a "t". If it exists and can be interpreted,
> then it will be used to define field types. This would make it really easy
> to manage when creating data, and would be compatible with GDAL/OGR.
>
> The VRT file is much more work as you suggest. The mapping between VRT and
> the CSV options is not complete. So the options around delimiters, skipped
> lines, regular expressions, and so on are not available for the VRT file.
> Conversely several of the VRT options don't apply within QGIS. So there is
> quite a lot of work in specifying how this should work. Also it would entail
> a major reworking in terms of how the file is opened (ie you would select the
> VRT file in the GUI, which would then have to identify the CSV file, and also
> handle politely VRT files which did not define CSV files, but which defined
> other data source types). This seems a lot of work and would end up
> re-engineering a lot of what is already in OGR. So (even as I'm writing
> this) I'm becoming less clear that this is a good approach.
>
> Returning to the CSVT idea - once the provider can use the CSVT then there
> remains the question of what GUI/API changes should be made to support it.
> Two thoughts come to mind immediately.
>
> One is, should the CSVT idea be extended to support the other metadata
> information required in setting up the file, such as the delimiter, etc? The
> OGR specification for CSVT just defines the field types in the first line. I
> don't know if it would ignore subsequent lines, in which case additional
> metadata could go there and still be compatible with OGR usage. Or should
> another metadata file (eg .metadata, .qgs, .dlt, ...)
> be used to hold all the information specifying how the file should be used
> (sounds really messy)?
> But it would be really nice to be able to just select a file and have all
> these options automatically populated if the metadata file existed.
>
> The other thought is around your suggestion of writing the CSVT/metadata
> file. The main extra work involved in this is the user interface for
> defining field types.
>
> The dialog box is already quite busy, but I guess a simple approach would be
> just to add a row to the preview box under the column headings with a field
> type selector for each column, and values
> "Auto,Text,Integer,Real,Date,Time,DateTime", or something like that. The
> field types could then be passed through to the provider in the datasource
> URI (ie with a parameter such as "fieldtypes=text,text,integer,..."). This
> also doesn't sound like too much work.
>
> Once this is done it would be simple to add a "save settings to metadata
> file" type button to the GUI, which could write the CSVT/metadata file.
>
> This would create one more tricky question, of how to handle conflicts
> between metadata read in a CSVT/metadata file and that in the datasource URI.
>
> Enough rambling. I expect it will be a couple of weeks before I can consider
> this much more (though I may consider handling the CSVT file sooner).
>
> Cheers
> Chris
>
>
>> -----Original Message-----
>> From: HAUBOURG [mailto:regis.haubo...@eau-adour-garonne.fr]
>> Sent: Thursday, 23 May 2013 6:45 a.m.
>> To: Chris Crook
>> Subject: RE : Delimited text debug
>>
>> Thanks for your feedback Chris.
>>
>> I have been thinking of it all day, and got to the following observations and
>> conclusions:
>>
>> 1- there are two concurrent ways to open csv in qgis: ogr native and your
>> plugin.
>> ogr gdal offers some features like vrt (enabling geometry columns, xy, yx
>> columns) and csvt (basic types for columns) that remain unused in qgis
>> (unless you're using them for your code).
>>
>> 2- users have no way to create points from attribute data, except using your
>> plugin. csv export and field types can be a pain.
>>
>> 3- there is no way to change a data type on the fly, so the user has to do
>> the import again, and is sometimes stuck if no ETL or database is available
>> or understood.
>>
>> From a user point of view, I think we should do two things:
>>
>> A: unify import for data sources to avoid the two different entries
>> - merge all import tools based on your approach, with a preview gui (choose
>> encoding, skip lines...)
>> - enable other options for all text based files (field delimiter, text
>> delimiter, decimal delimiter, trim fields.. )
>> - WKT or XY chooser for geometry fields (for all data sources: native
>> geometry / no geometry / text fields)
>> - field type chooser with automatic guess (gdal does it)
>> - option to save a vrt / csvt so that a user can easily reopen the data
>> without redoing all the import stuff.
>> This is a big refactoring of vector layer add dialog. Nathan add some mockups
>> for this that could do.
>>
>> B: add a vector tool to create geometry from any attribute data (xy, wkt.. )
>> of any loaded data source. users that imported data could then spatialize
>> data in a second step (like Mapinfo does)
>>
>> In my corp users really need that, so I probably will fund that. Do you have
>> some feedback on that?
>> Régis
>> ________________________________________
>> From: Chris Crook [ccr...@linz.govt.nz]
>> Sent: Wednesday 22 May 2013 19:54
>> To: HAUBOURG
>> Subject: RE: Delimited text debug
>>
>> Hi Régis
>>
>> It could be a useful improvement - basically to allow setting types of
>> columns.
>> Associated with this I'd like to add date types, which would require some
>> explicit definition by the user. This will not make 2.0, but certainly worth
>> doing for the next release.
>>
>> As a workaround for the moment could you add an extra row of dummy data
>> with a non-numeric value in the key column. The provider will then treat it
>> as a text column and the joining should work ok.
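
To make the CSVT idea from this thread a bit more concrete, here is a rough sketch in Python. It is only an illustration, not provider code: the helper names (sidecar_path, parse_csvt, declared_types, guess_type, convert) are made up for this example, and it assumes the CSVT file holds a single line of quoted type names such as "String","Integer","Real(10.7)", which is the form OGR reads. It follows the "data file name plus a t" lookup suggested above and shows why a scan-based guess turns the key "09" into 9 while a declared String type keeps it intact.

```python
import csv
import io
import os
import re


# All helper names below are hypothetical; none exist in QGIS or OGR.

def sidecar_path(data_path):
    """Candidate type file: the data file name with a 't' appended."""
    return data_path + "t"


def parse_csvt(text):
    """Parse the first line of a CSVT file into lower-case type names.

    Width/precision suffixes such as Real(10.7) are dropped.
    """
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    row = next(csv.reader(io.StringIO(first_line)), [])
    return [re.sub(r"\(.*\)$", "", name.strip()).lower() for name in row]


def declared_types(data_path):
    """Return the types from the sidecar file, or None if there is none."""
    path = sidecar_path(data_path)
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as handle:
        return parse_csvt(handle.read())


def _all_parse(values, parser):
    """True if every value can be parsed by the given constructor."""
    try:
        for value in values:
            parser(value)
    except ValueError:
        return False
    return True


def guess_type(values):
    """Scan-based guess: integer if all values parse as int, then real."""
    if _all_parse(values, int):
        return "integer"
    if _all_parse(values, float):
        return "real"
    return "string"


def convert(value, type_name):
    """Convert one raw text value according to a type name."""
    if type_name == "integer":
        return int(value)
    if type_name == "real":
        return float(value)
    return value  # string, date, etc. are left as text in this sketch


keys = ["09", "31"]
guessed = [convert(k, guess_type(keys)) for k in keys]    # leading zero lost
declared = [convert(k, parse_csvt('"String"')[0]) for k in keys]  # kept
```

With a sidecar like this, the dummy-row workaround would not be needed: the key column is declared as text once, and reloading different data cannot change its type.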
These all sound like good changes!

Regards

Tim

>>
>> Cheers
>> Chris
>> ________________________________________
>> From: HAUBOURG [regis.haubo...@eau-adour-garonne.fr]
>> Sent: 22 May 2013 23:12
>> To: Chris Crook
>> Subject: RE: Delimited text debug
>>
>> Hi Chris,
>> I'm facing a problem here. We have most of our administrative areas
>> identified with a text key, but composed only of numbers ("09 ", "31").
>> I have no way to choose to interpret text delimiters, and then, data is
>> corrupted (09 becomes 9) and there is no way to join data with the
>> geographic layer..
>> Is that a possible improvement to your plugin? I will file a ticket if
>> needed.
>> Cheers,
>> Régis
>>
>> This message contains information, which is confidential and may be subject
>> to legal privilege. If you are not the intended recipient, you must not
>> peruse, use, disseminate, distribute or copy this message. If you have
>> received this message in error, please notify us immediately (Phone 0800 665
>> 463 or i...@linz.govt.nz) and destroy the original message. LINZ accepts no
>> responsibility for changes to this email, or for any attachments, after its
>> transmission from LINZ. Thank You.
> _______________________________________________
> Qgis-developer mailing list
> Qgis-developer@lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/qgis-developer

--
Tim Sutton - QGIS Project Steering Committee Member (Release Manager)
==============================================
Please do not email me off-list with technical support questions. Using
the lists will gain more exposure for your issues and the knowledge
surrounding your issue will be shared with all.

Visit http://linfiniti.com to find out about:
 * QGIS programming and support services
 * Mapserver and PostGIS based hosting plans
 * FOSS Consulting Services
Skype: timlinux
Irc: timlinux on #qgis at freenode.net
==============================================