Re: Loading an CSV
I had a similar situation recently in an export from an old legacy application. (90% of my job) It may not apply to your situation, but it sure saves me a ton of time, so it is slightly relevant here. (Windows info only here) The data contained every single char (comma, semicolon, TAB, QUOTES[" '], multiple spaces) that it shouldn't, but would happily export the CSV anyway. "DATA","BELLS "WHISTLES", 'MOATS'; dragons ~ and gypsies ","more data" I found out the app used the systems regional delimiter just like Excel does and a simple system change let us export it as PIPE (|) delimited. I found this info somewhere, but I cant find the link for credits. To change the default on your PC to a pipe rather than a comma bring up your default options window by clicking Start > Settings > Control Panel > Regional Settings. Click the "Number" tab and in the List "Separator" field, replace the current default separator with the one you want to use (pipe symbol | ). Click "OK" to save the change and close. You can now save files as pipe delimited files. In Windows XP there is an extra step involved. When you open the Regional settings window, you have to click customize next to the language choice drop down. That will bring up another window with "Numbers" as the first tab item. Change the "List separator" to whatever you want to use as a delimiter. Why Excel does not have this fearture built in I will never know. Hope that helps. /K -- "The illusion of freedom will continue as long as it's profitable to continue the illusion. At the point where the illusion becomes too expensive to maintain, they will just take down the scenery, they will pull back the curtains, they will move the tables and chairs out of the way and you will see the brick wall at the back of the theater." -Frank Zappa ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329330 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
> Oracle Okay. I was thinking there might be options with an MS datasource. But as that is out of the question, I would go with Barney's suggestion. (Looks interesting. I will have to try it myself). ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329321 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
Ok.. I see what you are saying... Thanks. :) On Tue, Dec 22, 2009 at 9:26 AM, Barney Boisvert wrote: > > Quoting is part of standard CSV, Ostermiller will take care of it. > But you don't need to loop over every character. Once you have your > array, you start combining when you find an item that begins with a > quote, and you stop combining when you find an item that ends with a > quote. In my example, item 2 both starts and ends with a quote, so, > and then item 3 starts with a quote, and item 5 ends with a quote, > which is why you combine 3-5. The quotes in item 4 are escaped, but > you probably don't have any of those (at least your example didn't > illustrate any). > > cheers, > barneyb > > On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector > wrote: >> >> On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert wrote: >>> >>> Use a CSV parsing library, rather than rolling your own. They take >>> care of all that stuff for you. I've used >>> http://ostermiller.org/utils/CSV.html in the past. >> >> I took a look at that and didn't see anything that would be of help >> since it's not standard formatted that I can see. >> >>> If you really want to parse it yourself, you can use listToArray, and >>> then iterate over the array and combine items that are quoted. For >>> example take this line: >>> >>> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5" >>> >>> When you listToArray it, you'll get this: >>> >>> [ >>> '1', >>> '"barney"', >>> '"boisvert', >>> ' \"crazy man\"', >>> ' barney"', >>> '"1234 Main Street', >>> ' Apt 5"' >>> ] >>> >>> Note the position of the double quotes in the items. What you need to >>> do is find items that START with a double quote, and then combine them >>> with the following elements until you find an element that ENDS with a >>> double quote. You'll need to handle the case when the current item is >>> the whole string (in the case of the second element), and when the >>> ending quote is escaped (the forth item). >>> >>> In this particular case you need to remove the quotes from item 2 >>> (since it's a whole string to itself), and you need to combine items >>> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote). The quotes >>> in item 4 are escaped, so they should be ignored for combination, and >>> then the backslashes should be removed after the fact. Note that >>> you'll need to deal with double-escaping. As you can see, it's a >>> mess, so I'd highly recommend the third-party library. ;) >> >> See, the issue is that there is no escaped quotes to show it's part of >> the field. >> >> 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25 >> 3270,5101650,Phillip Vector , 34 ," 3,161.00 >> ",92.97,79.25,61.76,17.50 >> 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25 >> >> So I'm not sure how to determine the end quote. I suppose I can set a >> flag if I have started with a quote and unset the flag when I >> encounter another quote... But I was hoping I didn't have to loop over >> every character to do it. >> >> > > ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329320 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
Quoting is part of standard CSV, Ostermiller will take care of it. But you don't need to loop over every character. Once you have your array, you start combining when you find an item that begins with a quote, and you stop combining when you find an item that ends with a quote. In my example, item 2 both starts and ends with a quote, so, and then item 3 starts with a quote, and item 5 ends with a quote, which is why you combine 3-5. The quotes in item 4 are escaped, but you probably don't have any of those (at least your example didn't illustrate any). cheers, barneyb On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector wrote: > > On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert wrote: >> >> Use a CSV parsing library, rather than rolling your own. They take >> care of all that stuff for you. I've used >> http://ostermiller.org/utils/CSV.html in the past. > > I took a look at that and didn't see anything that would be of help > since it's not standard formatted that I can see. > >> If you really want to parse it yourself, you can use listToArray, and >> then iterate over the array and combine items that are quoted. For >> example take this line: >> >> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5" >> >> When you listToArray it, you'll get this: >> >> [ >> '1', >> '"barney"', >> '"boisvert', >> ' \"crazy man\"', >> ' barney"', >> '"1234 Main Street', >> ' Apt 5"' >> ] >> >> Note the position of the double quotes in the items. What you need to >> do is find items that START with a double quote, and then combine them >> with the following elements until you find an element that ENDS with a >> double quote. You'll need to handle the case when the current item is >> the whole string (in the case of the second element), and when the >> ending quote is escaped (the forth item). >> >> In this particular case you need to remove the quotes from item 2 >> (since it's a whole string to itself), and you need to combine items >> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote). The quotes >> in item 4 are escaped, so they should be ignored for combination, and >> then the backslashes should be removed after the fact. Note that >> you'll need to deal with double-escaping. As you can see, it's a >> mess, so I'd highly recommend the third-party library. ;) > > See, the issue is that there is no escaped quotes to show it's part of > the field. > > 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25 > 3270,5101650,Phillip Vector , 34 ," 3,161.00 > ",92.97,79.25,61.76,17.50 > 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25 > > So I'm not sure how to determine the end quote. I suppose I can set a > flag if I have started with a quote and unset the flag when I > encounter another quote... But I was hoping I didn't have to loop over > every character to do it. > > ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329319 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
Oracle, but I do not have permission to set up another datasource. So I need to do it by coding. On Tue, Dec 22, 2009 at 9:24 AM, Leigh wrote: > >> Defining another datasource is not allowed in this. > > What database(s) are you using? > > > > > ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329318 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
> Defining another datasource is not allowed in this. What database(s) are you using? ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329317 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
Re: Loading an CSV
On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert wrote: > > Use a CSV parsing library, rather than rolling your own. They take > care of all that stuff for you. I've used > http://ostermiller.org/utils/CSV.html in the past. I took a look at that and didn't see anything that would be of help since it's not standard formatted that I can see. > If you really want to parse it yourself, you can use listToArray, and > then iterate over the array and combine items that are quoted. For > example take this line: > > 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5" > > When you listToArray it, you'll get this: > > [ > '1', > '"barney"', > '"boisvert', > ' \"crazy man\"', > ' barney"', > '"1234 Main Street', > ' Apt 5"' > ] > > Note the position of the double quotes in the items. What you need to > do is find items that START with a double quote, and then combine them > with the following elements until you find an element that ENDS with a > double quote. You'll need to handle the case when the current item is > the whole string (in the case of the second element), and when the > ending quote is escaped (the forth item). > > In this particular case you need to remove the quotes from item 2 > (since it's a whole string to itself), and you need to combine items > 3, 4 and 5 (3 starts with a quote, 5 ends with a quote). The quotes > in item 4 are escaped, so they should be ignored for combination, and > then the backslashes should be removed after the fact. Note that > you'll need to deal with double-escaping. As you can see, it's a > mess, so I'd highly recommend the third-party library. ;) See, the issue is that there is no escaped quotes to show it's part of the field. 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25 3270,5101650,Phillip Vector , 34 ," 3,161.00 ",92.97,79.25,61.76,17.50 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25 So I'm not sure how to determine the end quote. I suppose I can set a flag if I have started with a quote and unset the flag when I encounter another quote... But I was hoping I didn't have to loop over every character to do it. ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329316 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
Re: Loading an CSV
Defining another datasource is not allowed in this. Yes, I know it's easier to do it that way, but unfortunally it's not an option here. On Tue, Dec 22, 2009 at 9:17 AM, Claude Schneegans wrote: > > Simply define an ODBC datasource using the Microsoft txt ODBC driver. > Then requst the datasaource to get all records. > > ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329315 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
Simply define an ODBC datasource using the Microsoft txt ODBC driver. Then requst the datasaource to get all records. ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329314 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Loading an CSV
Use a CSV parsing library, rather than rolling your own. They take care of all that stuff for you. I've used http://ostermiller.org/utils/CSV.html in the past. If you really want to parse it yourself, you can use listToArray, and then iterate over the array and combine items that are quoted. For example take this line: 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5" When you listToArray it, you'll get this: [ '1', '"barney"', '"boisvert', ' \"crazy man\"', ' barney"', '"1234 Main Street', ' Apt 5"' ] Note the position of the double quotes in the items. What you need to do is find items that START with a double quote, and then combine them with the following elements until you find an element that ENDS with a double quote. You'll need to handle the case when the current item is the whole string (in the case of the second element), and when the ending quote is escaped (the forth item). In this particular case you need to remove the quotes from item 2 (since it's a whole string to itself), and you need to combine items 3, 4 and 5 (3 starts with a quote, 5 ends with a quote). The quotes in item 4 are escaped, so they should be ignored for combination, and then the backslashes should be removed after the fact. Note that you'll need to deal with double-escaping. As you can see, it's a mess, so I'd highly recommend the third-party library. ;) cheers, barneyb On Tue, Dec 22, 2009 at 8:51 AM, Phillip Vector wrote: > > I have a CSV that looks like the following... > > 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25 > 3270,5101650,Phillip Vector , 34 ," 3,161.00 > ",92.97,79.25,61.76,17.50 > 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 > ,0.00,0.00,9.25,-9.25 > > I'm stuck on how to process the fields and load them into an array. > How would I say, "If the field is surrounded by quotes, then ignore > the commas in it"? > > I would think I would need to load it into the array as a field before > I could find out if it's surrounded by quotes. But I can't load it > correctly into the array until I find out if there are quotes around > it. > > Changing the CSV is not an option (I know, it would make things much easier). > > Ideas? > > ~| Want to reach the ColdFusion community with something they want? Let them know on the House of Fusion mailing lists Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329313 Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4