Re: Loading an CSV

2009-12-22 Thread Kevin Pepperman

I had a similar situation recently in an export from an old legacy
application. (90% of my job)

It may not apply to your situation, but it sure saves me a ton of time, so
it is slightly relevant here. (Windows info only here)

The data contained every single char (comma, semicolon, TAB, QUOTES[" '],
multiple spaces) that it shouldn't, but would happily export the CSV anyway.
"DATA","BELLS "WHISTLES", 'MOATS'; dragons ~ and gypsies ","more data"


I found out the app used the systems regional delimiter just like Excel does
and a simple system change let us export it as PIPE (|) delimited.

I found this info somewhere, but I cant find the link for credits.


To change the default on your PC to a pipe rather than a comma bring up your
default options window by clicking Start > Settings > Control Panel >
Regional Settings.

Click the "Number" tab and in the List "Separator" field, replace the
current default separator with the one you want to use (pipe symbol | ).

Click "OK" to save the change and close. You can now save files as pipe
delimited files.


In Windows XP there is an extra step involved.

When you open the Regional settings window, you have to click customize next
to the language choice drop down.

That will bring up another window with "Numbers" as the first tab item.
Change the "List separator" to whatever you want to use as a delimiter.


Why Excel does not have this fearture built in I will never know.

Hope that helps.

/K

-- 
"The illusion of freedom will continue as long as it's profitable to
continue the illusion. At the point where the illusion becomes too expensive
to maintain, they will just take down the scenery, they will pull back the
curtains, they will move the tables and chairs out of the way and you will
see the brick wall at the back of the theater."

-Frank Zappa


~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329330
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Leigh

> Oracle

Okay. I was thinking there might be options with an MS datasource. But as that 
is out of the question, I would go with Barney's suggestion. (Looks 
interesting. I will have to try it myself).


  

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329321
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Phillip Vector

Ok.. I see what you are saying... Thanks. :)

On Tue, Dec 22, 2009 at 9:26 AM, Barney Boisvert  wrote:
>
> Quoting is part of standard CSV, Ostermiller will take care of it.
> But you don't need to loop over every character.  Once you have your
> array, you start combining when you find an item that begins with a
> quote, and you stop combining when you find an item that ends with a
> quote.  In my example, item 2 both starts and ends with a quote, so,
> and then item 3 starts with a quote, and item 5 ends with a quote,
> which is why you combine 3-5.  The quotes in item 4 are escaped, but
> you probably don't have any of those (at least your example didn't
> illustrate any).
>
> cheers,
> barneyb
>
> On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector
>  wrote:
>>
>> On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert  wrote:
>>>
>>> Use a CSV parsing library, rather than rolling your own.  They take
>>> care of all that stuff for you.  I've used
>>> http://ostermiller.org/utils/CSV.html in the past.
>>
>> I took a look at that and didn't see anything that would be of help
>> since it's not standard formatted that I can see.
>>
>>> If you really want to parse it yourself, you can use listToArray, and
>>> then iterate over the array and combine items that are quoted.  For
>>> example take this line:
>>>
>>> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"
>>>
>>> When you listToArray it, you'll get this:
>>>
>>> [
>>>  '1',
>>>  '"barney"',
>>>  '"boisvert',
>>>  ' \"crazy man\"',
>>>  ' barney"',
>>>  '"1234 Main Street',
>>>  ' Apt 5"'
>>> ]
>>>
>>> Note the position of the double quotes in the items.  What you need to
>>> do is find items that START with a double quote, and then combine them
>>> with the following elements until you find an element that ENDS with a
>>> double quote.  You'll need to handle the case when the current item is
>>> the whole string (in the case of the second element), and when the
>>> ending quote is escaped (the forth item).
>>>
>>> In this particular case you need to remove the quotes from item 2
>>> (since it's a whole string to itself), and you need to combine items
>>> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote).  The quotes
>>> in item 4 are escaped, so they should be ignored for combination, and
>>> then the backslashes should be removed after the fact.  Note that
>>> you'll need to deal with double-escaping.  As you can see, it's a
>>> mess, so I'd highly recommend the third-party library.  ;)
>>
>> See, the issue is that there is no escaped quotes to show it's part of
>> the field.
>>
>> 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25
>> 3270,5101650,Phillip Vector , 34 ," 3,161.00
>> ",92.97,79.25,61.76,17.50
>> 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25
>>
>> So I'm not sure how to determine the end quote. I suppose I can set a
>> flag if I have started with a quote and unset the flag when I
>> encounter another quote... But I was hoping I didn't have to loop over
>> every character to do it.
>>
>>
>
> 

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329320
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Barney Boisvert

Quoting is part of standard CSV, Ostermiller will take care of it.
But you don't need to loop over every character.  Once you have your
array, you start combining when you find an item that begins with a
quote, and you stop combining when you find an item that ends with a
quote.  In my example, item 2 both starts and ends with a quote, so,
and then item 3 starts with a quote, and item 5 ends with a quote,
which is why you combine 3-5.  The quotes in item 4 are escaped, but
you probably don't have any of those (at least your example didn't
illustrate any).

cheers,
barneyb

On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector
 wrote:
>
> On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert  wrote:
>>
>> Use a CSV parsing library, rather than rolling your own.  They take
>> care of all that stuff for you.  I've used
>> http://ostermiller.org/utils/CSV.html in the past.
>
> I took a look at that and didn't see anything that would be of help
> since it's not standard formatted that I can see.
>
>> If you really want to parse it yourself, you can use listToArray, and
>> then iterate over the array and combine items that are quoted.  For
>> example take this line:
>>
>> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"
>>
>> When you listToArray it, you'll get this:
>>
>> [
>>  '1',
>>  '"barney"',
>>  '"boisvert',
>>  ' \"crazy man\"',
>>  ' barney"',
>>  '"1234 Main Street',
>>  ' Apt 5"'
>> ]
>>
>> Note the position of the double quotes in the items.  What you need to
>> do is find items that START with a double quote, and then combine them
>> with the following elements until you find an element that ENDS with a
>> double quote.  You'll need to handle the case when the current item is
>> the whole string (in the case of the second element), and when the
>> ending quote is escaped (the forth item).
>>
>> In this particular case you need to remove the quotes from item 2
>> (since it's a whole string to itself), and you need to combine items
>> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote).  The quotes
>> in item 4 are escaped, so they should be ignored for combination, and
>> then the backslashes should be removed after the fact.  Note that
>> you'll need to deal with double-escaping.  As you can see, it's a
>> mess, so I'd highly recommend the third-party library.  ;)
>
> See, the issue is that there is no escaped quotes to show it's part of
> the field.
>
> 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25
> 3270,5101650,Phillip Vector , 34 ," 3,161.00
> ",92.97,79.25,61.76,17.50
> 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25
>
> So I'm not sure how to determine the end quote. I suppose I can set a
> flag if I have started with a quote and unset the flag when I
> encounter another quote... But I was hoping I didn't have to loop over
> every character to do it.
>
> 

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329319
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Phillip Vector

Oracle, but I do not have permission to set up another datasource. So
I need to do it by coding.

On Tue, Dec 22, 2009 at 9:24 AM, Leigh  wrote:
>
>> Defining another datasource is not allowed in this.
>
> What database(s) are you using?
>
>
>
>
> 

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329318
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Leigh

> Defining another datasource is not allowed in this. 

What database(s) are you using? 


  

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329317
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4


Re: Loading an CSV

2009-12-22 Thread Phillip Vector

On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert  wrote:
>
> Use a CSV parsing library, rather than rolling your own.  They take
> care of all that stuff for you.  I've used
> http://ostermiller.org/utils/CSV.html in the past.

I took a look at that and didn't see anything that would be of help
since it's not standard formatted that I can see.

> If you really want to parse it yourself, you can use listToArray, and
> then iterate over the array and combine items that are quoted.  For
> example take this line:
>
> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"
>
> When you listToArray it, you'll get this:
>
> [
>  '1',
>  '"barney"',
>  '"boisvert',
>  ' \"crazy man\"',
>  ' barney"',
>  '"1234 Main Street',
>  ' Apt 5"'
> ]
>
> Note the position of the double quotes in the items.  What you need to
> do is find items that START with a double quote, and then combine them
> with the following elements until you find an element that ENDS with a
> double quote.  You'll need to handle the case when the current item is
> the whole string (in the case of the second element), and when the
> ending quote is escaped (the forth item).
>
> In this particular case you need to remove the quotes from item 2
> (since it's a whole string to itself), and you need to combine items
> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote).  The quotes
> in item 4 are escaped, so they should be ignored for combination, and
> then the backslashes should be removed after the fact.  Note that
> you'll need to deal with double-escaping.  As you can see, it's a
> mess, so I'd highly recommend the third-party library.  ;)

See, the issue is that there is no escaped quotes to show it's part of
the field.

3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25
3270,5101650,Phillip Vector , 34 ," 3,161.00
",92.97,79.25,61.76,17.50
3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25

So I'm not sure how to determine the end quote. I suppose I can set a
flag if I have started with a quote and unset the flag when I
encounter another quote... But I was hoping I didn't have to loop over
every character to do it.

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329316
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4


Re: Loading an CSV

2009-12-22 Thread Phillip Vector

Defining another datasource is not allowed in this. Yes, I know it's
easier to do it that way, but unfortunally it's not an option here.

On Tue, Dec 22, 2009 at 9:17 AM, Claude Schneegans
 wrote:
>
> Simply define an ODBC datasource using the Microsoft txt ODBC driver.
> Then requst the datasaource to get all records.
>
> 

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329315
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Claude Schneegans

Simply define an ODBC datasource using the Microsoft txt ODBC driver.
Then requst the datasaource to get all records.

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329314
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Loading an CSV

2009-12-22 Thread Barney Boisvert

Use a CSV parsing library, rather than rolling your own.  They take
care of all that stuff for you.  I've used
http://ostermiller.org/utils/CSV.html in the past.

If you really want to parse it yourself, you can use listToArray, and
then iterate over the array and combine items that are quoted.  For
example take this line:

1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"

When you listToArray it, you'll get this:

[
 '1',
 '"barney"',
 '"boisvert',
 ' \"crazy man\"',
 ' barney"',
 '"1234 Main Street',
 ' Apt 5"'
]

Note the position of the double quotes in the items.  What you need to
do is find items that START with a double quote, and then combine them
with the following elements until you find an element that ENDS with a
double quote.  You'll need to handle the case when the current item is
the whole string (in the case of the second element), and when the
ending quote is escaped (the forth item).

In this particular case you need to remove the quotes from item 2
(since it's a whole string to itself), and you need to combine items
3, 4 and 5 (3 starts with a quote, 5 ends with a quote).  The quotes
in item 4 are escaped, so they should be ignored for combination, and
then the backslashes should be removed after the fact.  Note that
you'll need to deal with double-escaping.  As you can see, it's a
mess, so I'd highly recommend the third-party library.  ;)

cheers,
barneyb

On Tue, Dec 22, 2009 at 8:51 AM, Phillip Vector
 wrote:
>
> I have a CSV that looks like the following...
>
> 3270,5101650,"Dewey, Cheatum & Howe ", 0   , 0   ,0.00,0.00,9.25,-9.25
> 3270,5101650,Phillip Vector                , 34 ," 3,161.00
> ",92.97,79.25,61.76,17.50
> 3270,5101650,"James P. Kardone JR., P.C.     ", 0   , 0   
> ,0.00,0.00,9.25,-9.25
>
> I'm stuck on how to process the fields and load them into an array.
> How would I say, "If the field is surrounded by quotes, then ignore
> the commas in it"?
>
> I would think I would need to load it into the array as a field before
> I could find out if it's surrounded by quotes. But I can't load it
> correctly into the array until I find out if there are quotes around
> it.
>
> Changing the CSV is not an option (I know, it would make things much easier).
>
> Ideas?
>
> 

~|
Want to reach the ColdFusion community with something they want? Let them know 
on the House of Fusion mailing lists
Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329313
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4