[ http://issues.apache.org/jira/browse/SOLR-66?page=comments#action_12447877 ]

Fuad Efendi commented on SOLR-66:
---------------------------------

Encoding:
How should a comma inside a field value be encoded?
How should UTF-8 (non-ASCII) text be handled?
Should we Base64-encode the raw values instead?

http://rfc.net/rfc4180.html:
"Common usage of CSV is US-ASCII, but other character sets defined by IANA for
the "text" tree may be used in conjunction with the "charset" parameter."
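
To make the comma and UTF-8 questions concrete, here is a minimal sketch (not a proposal for the loader itself) using Python's standard csv module. It assumes RFC 4180 style quoting: a field containing a comma is wrapped in double quotes, and an embedded double quote is doubled. The helper names encode_rows and decode_rows are hypothetical, chosen only for this illustration. If the byte stream is declared as UTF-8, Base64 does not seem necessary for plain text values.

```python
import csv
import io


def encode_rows(rows):
    """Serialize rows to CSV text, then to UTF-8 bytes (hypothetical helper)."""
    buf = io.StringIO(newline="")
    # Default dialect: fields containing commas or quotes are quoted,
    # embedded quotes are doubled, and lines end in CRLF.
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


def decode_rows(data):
    """Parse UTF-8 encoded CSV bytes back into rows (hypothetical helper)."""
    text = data.decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))


rows = [["1", "Solr, Lucene", 'say "hi"', "café"]]
data = encode_rows(rows)
print(data.decode("utf-8"), end="")
assert decode_rows(data) == rows  # commas, quotes and non-ASCII round-trip
```

Binary values (as opposed to text) would be a different case, and Base64 might still make sense there.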

http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
http://www.edoceo.com/utilis/csv-file-format.php
http://www.ricebridge.com/products/csvman/reference.htm

This is interesting (from the last link):
FIELD:    [trim]? ( UNQUOTED | QUOTED ) [trim]? 
UNQUOTED: ( [data]* | ESCAPE )*;
QUOTED:   [quote] ( DOUBLE | ESCAPE | [data]* )* [quote]
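
A rough sketch of how a single FIELD from the grammar above could be decoded. The grammar as quoted does not say what [trim], [quote], ESCAPE and DOUBLE actually are, so this assumes whitespace trimming, '"' as the quote character, a backslash followed by any character as ESCAPE, and two consecutive quotes as DOUBLE. The function name parse_field is hypothetical, and splitting a record into fields is out of scope here.

```python
def parse_field(raw, quote='"', escape="\\"):
    """Decode one CSV field per the FIELD / UNQUOTED / QUOTED grammar.

    The quote and escape characters are assumptions; the grammar leaves
    them configurable.
    """
    s = raw.strip()  # [trim]? on both sides
    quoted = len(s) >= 2 and s[0] == quote and s[-1] == quote
    body = s[1:-1] if quoted else s
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == escape and i + 1 < len(body):
            # ESCAPE: take the next character literally
            out.append(body[i + 1])
            i += 2
        elif quoted and ch == quote and body[i + 1:i + 2] == quote:
            # DOUBLE: two quotes inside a quoted field mean one quote
            out.append(quote)
            i += 2
        else:
            # plain [data]
            out.append(ch)
            i += 1
    return "".join(out)


print(parse_field('  "Solr, Lucene"  '))
print(parse_field('"say ""hi"""'))
print(parse_field("a\\,b"))
```

With these assumptions a comma can be carried either inside a quoted field or behind an escape character in an unquoted one.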




> bulk data loader
> ----------------
>
>                 Key: SOLR-66
>                 URL: http://issues.apache.org/jira/browse/SOLR-66
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Yonik Seeley
>         Assigned To: Yonik Seeley
>
> A way to efficiently load simple formatted text files, including CSV files.


        
