[ 
https://issues.apache.org/jira/browse/DRILL-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728079#comment-17728079
 ] 

Charles Givre commented on DRILL-8439:
--------------------------------------

Can you please verify in the CSV file that the affected column name doesn't 
have any other leading characters?  Please check for carriage returns and other 
invisible Unicode characters.  The fact that Drill is inserting an extra 
underscore leads me to believe there could be some extra garbage in that field.
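
One way to run that check is to dump every byte in the header line that is not 
printable ASCII. The following is only a minimal sketch: the class name 
{{HeaderInspector}} and its methods are made up for illustration and are not 
part of Drill, and the default path is the one from the issue description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Apache Drill.
public class HeaderInspector {

    /** Reads the raw bytes of the first line, up to but excluding the first LF. */
    public static byte[] firstLine(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1 && b != '\n') {
                out.write(b);
            }
            return out.toByteArray();
        }
    }

    /**
     * Lists every byte outside printable ASCII (0x20 to 0x7E) with its offset.
     * This flags carriage returns, tabs, and the bytes of any multi-byte
     * sequence such as a UTF-8 byte order mark.
     */
    public static List<String> suspiciousBytes(byte[] line) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < line.length; i++) {
            int b = line[i] & 0xFF; // treat the byte as unsigned
            if (b < 0x20 || b > 0x7E) {
                found.add(String.format(Locale.ROOT, "offset %d: 0x%02X", i, b));
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the issue; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "/var/lib/PRODUCT.csv");
        List<String> found = suspiciousBytes(firstLine(file));
        if (found.isEmpty()) {
            System.out.println("Header line contains only printable ASCII.");
        } else {
            found.forEach(System.out::println);
        }
    }
}
```

If this reports anything at the very start or end of the header line, the 
column names are probably not as clean as they look in an editor.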

In any event, can't you just query this by giving it an alias?

For example:

{{SELECT `col__PRODUCTID_` AS product_id ...}}

> Getting col__ prefix for columns that are not special when extractHeader is 
> enabled
> -----------------------------------------------------------------------------------
>
>                 Key: DRILL-8439
>                 URL: https://issues.apache.org/jira/browse/DRILL-8439
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata, SQL Parser
>    Affects Versions: 1.21.0
>         Environment: Enabled {{extractHeader}} in the csv config of dfs 
> plugin.
> No. of drillbits: Single
> OS: Windows
>            Reporter: Diksha Chaturvedi
>            Priority: Major
>              Labels: drill, extractHeader
>
> According to the documentation, Drill prepends col_ to column names that 
> start with a number or a special character.
> {code:java}
> /**
>  * Prefix used to replace non-alphabetic characters at the start of
>  * a column name. For example, $foo becomes col_foo. Used
>  * because SQL does not allow _foo.
>  */
> public static final String COLUMN_PREFIX = "col_";
> {code}
> But in my case I get the prefix even for a column name that is entirely alphabetic.
> ----
> I have the following data in the CSV file:
> ||PRODUCTID||PRODUCTNAME||SUPPLIERID||CATEGORYID||UNIT||PRICE||
> |1|Chais|1|1|10 boxes x 20 bags|18|
> |2|Chang|1|1|24 - 12 oz bottles|19|
> |3|Aniseed Syrup|1|2|12 - 550 ml bottles|10|
> |4|Chef Anton's Cajun Seasoning|2|2|48 - 6 oz jars|22|
> |5|Chef Anton's Gumbo Mix|2|2|36 boxes|21.35|
>  
> When I query the CSV file with the following query:
> {code:sql}
> SELECT * FROM dfs.`/var/lib/PRODUCT.csv`{code}
> The output is:
> [!https://i.stack.imgur.com/FBNmn.png|width=611,height=130!|https://i.stack.imgur.com/FBNmn.png]
> ----
> I know about the other rules, for example:
> {{#UNITS}} is changed to {{col_UNITS}}
> {{FINANCIAL$RECORD}} is changed to {{FINANCIAL_RECORD}}
> But what about {{PRODUCTID}}? Why is it changed to 
> {{col___PRODUCTID__}}? In this case extra underscores have also been added.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
