I should have said that half was entered via the web and the other half came in
via a range of different spreadsheets, some of which of course used commas
instead of decimal points in numeric fields.  There were only about 10,000
records by about 20 fields, so it was possible to check by hand to some extent.
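
For the decimal-comma problem mentioned above, a minimal Python sketch follows.
It assumes the affected fields contain no thousands separators and that it is
only applied to columns known to be numeric; the function name
normalise_number is made up for illustration.

```python
import re

# Matches a number that uses a comma as the decimal mark, e.g. "12,5" or "-0,75".
# Assumption: no thousands separators are present in the field.
DECIMAL_COMMA = re.compile(r"^\s*([+-]?\d+),(\d+)\s*$")

def normalise_number(value):
    """Return the field with a decimal comma replaced by a decimal point.

    Values that do not look like a decimal-comma number are returned unchanged.
    """
    match = DECIMAL_COMMA.match(value)
    if match:
        return f"{match.group(1)}.{match.group(2)}"
    return value
```

Free text such as "Paris, France" does not match the pattern and is left alone,
which is why the pattern is anchored to the whole field.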

-----Original Message-----
From: Alex Mandel [mailto:tech_...@wildintellect.com] 
Sent: 21 June 2011 09:44
To: qgis-user@lists.osgeo.org
Subject: Re: [Qgis-user] Delimited text plugin with fields containing commas

If it was being entered on the web it should have gone straight into a 
database, so there should be no need for a text file; it should all be db dumps 
if you need to move things. Personally I'd recommend putting it in SQLite or 
Postgres to start, then you can move directly to SpatiaLite or PostGIS, or use 
those from the beginning.
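
As a rough illustration of getting delimited text into SQLite, here is a small
sketch using only the Python standard library. The table name "records", the
sample data and the helper load_rows are assumptions for the example; every
column is stored as TEXT and the header names are assumed to contain no double
quotes.

```python
import csv
import io
import sqlite3

def load_rows(conn, text, table="records"):
    """Load delimited text into a SQLite table, one TEXT column per header field."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    # Values are passed as parameters, so commas and quotes in the data are safe.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()

# Hypothetical sample: one record whose second field contains a comma.
sample = 'id,place,count\n2,"test,test",1\n'
conn = sqlite3.connect(":memory:")
load_rows(conn, sample)
rows = conn.execute('SELECT * FROM "records"').fetchall()
```

Once the data is in a table, the embedded commas stop mattering, because the
database keeps the field boundaries itself.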

It just raises the need to really think about data entry in project design and 
to get good at writing fancy Python scripts to clean up bad text files people 
give you based on regex rules and other magic.
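
One example of such a regex rule, as a sketch only: the cleanup below replaces
control characters and collapses runs of whitespace while leaving letters from
any language, and commas, untouched. The name clean_field and the choice of
rules are assumptions, not something taken from this thread.

```python
import re
import unicodedata

# Control characters (including tabs and newlines) that tend to break importers.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
WHITESPACE_RUN = re.compile(r"\s+")

def clean_field(value):
    """Normalise one free-text field without discarding non-English letters."""
    # Compose characters so that "e" + combining accent becomes a single "é".
    value = unicodedata.normalize("NFC", value)
    value = CONTROL_CHARS.sub(" ", value)
    return WHITESPACE_RUN.sub(" ", value).strip()
```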

Enjoy,
Alex

On 06/21/2011 12:48 AM, M.E.Dodd wrote:
> I've had a whole range of similar problems with a large database of 
> publicly entered records from many countries. It contains all sorts of 
> strange characters (in many languages), including most characters you'd think 
> of as possibilities for column separators.  It was rather a nightmare to tidy 
> up and analyse; I ended up stripping out loads of different characters just to 
> get it read by spreadsheet and GIS.  Ideally there might be some completely 
> different separator that could easily be edited in to show the columns and 
> keep most of the commas and other characters within the columns but the big 
> issue is how to do that editing in the first place to correctly identify the 
> columns or easily allow moving of text between columns if an automatic import 
> gets it wrong.
> I have sorted most of it out now, in my case, by long laborious means, but if 
> someone could come up with a good way of dealing with this kind of messy file 
> (entered by the general public in many countries, so with potentially 
> unpredictable strange characters) that would be very useful.  Before you 
> say we should have been much more restrictive on the web input: we were 
> fairly restrictive but still needed to allow quite a range of possible inputs 
> in free text in any language.
> 
> From: John Callahan [mailto:john.calla...@udel.edu]
> Sent: 21 June 2011 00:36
> To: t...@wildintellect.com
> Cc: qgis-user
> Subject: Re: [Qgis-user] Delimited text plugin with fields containing 
> commas
> 
> You're correct.  That way probably would be the preferred work-around.
> 
> - John
> 
> 
> On Mon, Jun 20, 2011 at 5:18 PM, Alex Mandel 
> <tech_...@wildintellect.com<mailto:tech_...@wildintellect.com>> wrote:
> I agree quotes should work, but I've found that many parsers do not follow 
> this expectation. As for semicolons, I only meant as the 
> delimiter, leaving the commas inside your text. That way you can tell 
> the parser that ; is the separator between fields.
> 
> Thanks,
> Alex
> 
> On 06/20/2011 01:04 PM, John Callahan wrote:
>> I use semi-colons when I can but have run into situations where 
>> commas are necessary, such as names of places.  I agree with the work-around 
>> and I've done that before.  As long as quotes (") are included around the
>> values, it should work, and I believe it was working for a while.
>>
>> - John
>>
>> ***********************************
>> John Callahan, Research Scientist
>> Delaware Geological Survey
>> University of Delaware
>> URL: http://www.dgs.udel.edu
>> *******************************
>>
>>
>> On Mon, Jun 20, 2011 at 3:59 PM, Alex Mandel 
>> <tech_...@wildintellect.com<mailto:tech_...@wildintellect.com>>wrote:
>>
>>> On 06/20/2011 12:35 PM, John Callahan wrote:
>>>> Has anyone seen this problem with the Delimited Text plugin?  I am 
>>>> seeing this in today's download of QGIS 1.7 standalone on Windows, 
>>>> and on a recent install through OSGeo4W of 1.8-trunk.
>>>>
>>>> "Delimited text" plugin doesn't allow to load csv file with field 
>>>> with commas
>>>> http://hub.qgis.org/issues/2208
>>>>
>>>> - John
>>>
>>> I have had that problem before, with lots of CSV import tools (not just 
>>> QGIS). Are you using commas to separate the values too? I usually 
>>> have much better success changing that to ; or | so instead of 
>>> "2","test,test","1"
>>> the line becomes
>>> "2";"test,test";"1"
>>>
>>> Easiest way to swap out the delimiter is to use 
>>> OpenOffice/LibreOffice and change it when saving.
>>>
>>> Thanks,
>>> Alex
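
The delimiter swap discussed in the quoted messages can also be scripted rather
than done through OpenOffice/LibreOffice. The sketch below uses Python's
standard csv module; the helper name change_delimiter is hypothetical, and it
says nothing about how the QGIS Delimited Text plugin itself parses files.

```python
import csv
import io

def change_delimiter(text, old=",", new=";"):
    """Re-write delimited text with a new delimiter, quoting every field."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=old, quotechar='"')
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=new, quotechar='"',
                        quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# The example line from the thread: the middle field contains a comma.
line = '"2","test,test","1"\n'
fields = next(csv.reader(io.StringIO(line)))  # quote-aware parse of the line
converted = change_delimiter(line)            # same record, ; as delimiter
```

A quote-aware parser keeps "test,test" as one field, so the comma inside the
quotes survives the conversion unchanged.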

_______________________________________________
Qgis-user mailing list
Qgis-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/qgis-user

-- 
The Open University is incorporated by Royal Charter (RC 000391), an exempt 
charity in England & Wales and a charity registered in Scotland (SC 038302).
