I recommend the opencsv library for Java or the csv package for Python.
Either one can write legal CSV files.

There are lots of corner cases in CSV and some differences between
applications, like whether newlines are allowed inside a quoted field.
It is best to use a library for this instead of hacking at it.
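
For instance, here is a minimal sketch with the Python csv package,
applied to the string from the question below. It assumes Python 3, and
to_csv_line is just an illustrative helper name, not part of the package.
It only shows what the csv package writes and reads back; whether a given
CSV loader accepts that output is a separate question.

```python
import csv
import io


def to_csv_line(fields):
    """Return one CSV record for the given fields, without the line terminator."""
    buf = io.StringIO()
    # QUOTE_ALL wraps every field in quotes; embedded quotes are doubled
    # by default (doublequote=True) and backslashes are left untouched.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


# The problem string: a backslash followed by a quote.
text = 'This is text with an unusual \\"combination" of characters'
line = to_csv_line([text])
print(line)
# "This is text with an unusual \""combination"" of characters"

# Reading the record back with the same package returns the original string.
assert next(csv.reader([line])) == [text]
```

So with this package, \" is written as \"": the backslash is left alone
and the quote is doubled.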

We should use opencsv in Solr, too: http://opencsv.sourceforge.net/
It is under the Apache 2.0 license.

If you really want to write it yourself, here is a Python routine
that I used before finding the csv package:

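def csvsafe(s):
    """Return s as a double-quoted CSV field.

    Lossy: every run of whitespace, newlines included, becomes one space.
    """
    # None and the empty string both become an empty quoted field
    if not s:
        return '""'

    # collapse whitespace runs to single spaces; split() also drops
    # leading and trailing whitespace, so no separate strip() is needed
    s = ' '.join(s.split())
    if not s:
        return '""'

    # escape embedded quotes by doubling them
    s = s.replace('"', '""')

    return '"' + s + '"'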
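
One way to sanity-check a hand-written routine like this is to join a few
fields with commas and parse the result with the csv package. This is only
a sketch, assuming Python 3; the sample strings are made up, and the
function is repeated so the block runs on its own.

```python
import csv


def csvsafe(s):
    # Same logic as the routine above, repeated so this block is standalone.
    if not s:
        return '""'
    s = ' '.join(s.split())
    if not s:
        return '""'
    return '"' + s.replace('"', '""') + '"'


# Made-up sample fields: plain text, embedded quotes, a backslash before
# a quote, an embedded newline, and an empty string.
samples = ['plain', 'say "hi"', 'back\\"slash', 'two\nlines', '']

record = ','.join(csvsafe(s) for s in samples)
parsed = next(csv.reader([record]))

# Quotes and backslashes survive the round trip. The newline does not:
# csvsafe collapses it to a single space before quoting.
assert parsed == ['plain', 'say "hi"', 'back\\"slash', 'two lines', '']
```

The check also makes the trade-off visible: the routine sidesteps the
newline-in-field corner case by throwing the newline away.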

wunder

On 1/4/08 1:08 AM, "Michael Lackhoff" <[EMAIL PROTECTED]> wrote:

> On 03.01.2008 17:16 Yonik Seeley wrote:
> 
>> CSV doesn't use backslash escaping.
>> http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
>> 
>> "This is text with a ""quoted"" string"
> 
> Thanks for the hint but the result is the same, that is, ""quoted""
> behaves exactly like \"quoted\":
> - both leave the single unescaped quote in the record: "quoted"
> - both have the problem with a backslash before the escaped quote:
>   "This is text with a \""quoted"" string" gives an error "invalid
>   char between encapsulated token end delimiter".
> 
> So, is it possible to get a record into the index with csv that
> originally looks like this?:
> This is text with an unusual \"combination" of characters
> 
> A single quote is no problem: just double it (" -> "").
> A single backslash is no problem: just leave it alone (\ -> \)
> But what about a backslash followed by a quote (\" -> ???)
> 
> -Michael
> 
