[ 
https://issues.apache.org/jira/browse/DRILL-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips resolved DRILL-2760.
------------------------------------
    Resolution: Fixed

Fixed before 48d8a59

> Quoted strings from CSV file appear in query output in different forms
> ----------------------------------------------------------------------
>
>                 Key: DRILL-2760
>                 URL: https://issues.apache.org/jira/browse/DRILL-2760
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: 
> Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 
> EDT
> 4 node cluster on CentOS
>            Reporter: Khurram Faraaz
>            Assignee: Steven Phillips
>             Fix For: 1.2.0
>
>
> Quoted strings appear in query output in different forms, as shown in the 
> section below.
> Quotes should NOT appear in query output. Strings must be stripped of their 
> leading and prevailing quotes. (I am referring to this character - " )
> Snippet of data from airports.cv file, first three lines, the first line has 
> header information.
> {code}
> [root@centos-01 airport_CSV_data]# head -3 airports.csv
> "id","ident","type","name","latitude_deg","longitude_deg","elevation_ft","continent","iso_country","iso_region","municipality","scheduled_service","gps_code","iata_code","local_code","home_link","wikipedia_link","keywords"
> 6523,"00A","heliport","Total Rf 
> Heliport",40.07080078125,-74.9336013793945,11,"NA","US","US-PA","Bensalem","no","00A",,"00A",,,
> 6524,"00AK","small_airport","Lowell 
> Field",59.94919968,-151.695999146,450,"NA","US","US-AK","Anchor 
> Point","no","00AK",,"00AK",,,
> {code}
> case 1) In this case quotes are not escaped, they appear in the output as is.
> {code}
> 0: jdbc:drill:> select columns[0] id,columns[1] ident,columns[2] 
> type,columns[3] name,columns[4] latitude_deg,columns[5] 
> longitude_deg,columns[6] elevation_ft,columns[7] continent,columns[8] 
> iso_country,columns[9] iso_region,columns[10] municipality,columns[11] 
> scheduled_service,columns[12] gps_code,columns[13] iata_code, columns[14] 
> local_code,columns[15] home_link,columns[16] wikipedia_link,columns[17] 
> keywords from `airports.csv` limit 3;
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> |     id     |   ident    |    type    |    name    | latitude_deg | 
> longitude_deg | elevation_ft | continent  | iso_country | iso_region | 
> municipality | scheduled_service |  gps_code  | iata_code  | local_code | 
> home_link  | wikipedia_link |  keywords  |
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> | "id"       | "ident"    | "type"     | "name"     | "latitude_deg" | 
> "longitude_deg" | "elevation_ft" | "continent" | "iso_country" | "iso_region" 
> | "municipality" | "scheduled_service" | "gps_code" | "iata_code" | 
> "local_code" | "home_link" | "wikipedia_link" | "keywords" |
> | 6523       | "00A"      | "heliport" | "Total Rf Heliport" | 40.07080078125 
> | -74.9336013793945 | 11           | "NA"       | "US"        | "US-PA"    | 
> "Bensalem"   | "no"              | "00A"      |            | "00A"      |     
>        |                | null       |
> | 6524       | "00AK"     | "small_airport" | "Lowell Field" | 59.94919968  | 
> -151.695999146 | 450          | "NA"       | "US"        | "US-AK"    | 
> "Anchor Point" | "no"              | "00AK"     |            | "00AK"     |   
>          |                | null       |
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> 3 rows selected (0.155 seconds)
> {code}
> In this case quotes appear in the query output but they are escaped with 
> backslash character in the output.
> {code}
> 0: jdbc:drill:> select * from `airports.csv` limit 3;
> +------------+
> |  columns   |
> +------------+
> | 
> ["\"id\"","\"ident\"","\"type\"","\"name\"","\"latitude_deg\"","\"longitude_deg\"","\"elevation_ft\"","\"continent\"","\"iso_country\"","\"iso_region\"","\"municipality\"","\"scheduled_service\"","\"gps_code\"","\"iata_code\"","\"local_code\"","\"home_link\"","\"wikipedia_link\"","\"keywords\""]
>  |
> | ["6523","\"00A\"","\"heliport\"","\"Total Rf 
> Heliport\"","40.07080078125","-74.9336013793945","11","\"NA\"","\"US\"","\"US-PA\"","\"Bensalem\"","\"no\"","\"00A\"","","\"00A\"","",""]
>  |
> | ["6524","\"00AK\"","\"small_airport\"","\"Lowell 
> Field\"","59.94919968","-151.695999146","450","\"NA\"","\"US\"","\"US-AK\"","\"Anchor
>  Point\"","\"no\"","\"00AK\"","","\"00AK\"","",""] |
> +------------+
> 3 rows selected (0.097 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to