This might be a better option, since DRILL-6170 will introduce a rigid parsing 
definition. So, different fixed-width files can't leverage the same definition, 
though they might share the same extension. 

Thanks, Andries!

-----Original Message-----
From: Andries Engelbrecht [mailto:aengelbre...@mapr.com] 
Sent: Tuesday, February 20, 2018 7:39 AM
To: user@drill.apache.org
Subject: Re: Fixed-width files

You can also try and see if you can just use the CSV plugin to read a line as 
columns[0] and then use the substr function to pull out the fields in the line.

https://urldefense.proofpoint.com/v2/url?u=http-3A__drill.apache.org_docs_string-2Dmanipulation_-23substr&d=DwIGaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=-cT6otg6lpT_XkmYy7yg3A&m=oItppN_rkOKe_pgJb06T71ul6__8GsXmWQzTOQlCvBc&s=u6-Tx7rmfJQDa3_W3hg7YxojXP3Hf60YPLGHMnD8yLg&e=



Here is a simple example



Simple csv file



[test]$ cat test.csv

col1col2col3





jdbc:drill:zk=localhost:5181> select substr(columns[0],1,4), 
substr(columns[0],5,4), substr(columns[0],9,4) from  
dfs.root.`/data/csv/test/test.csv`;

+---------+---------+---------+

| EXPR$0  | EXPR$1  | EXPR$2  |

+---------+---------+---------+

| col1    | col2    | col3    |

+---------+---------+---------+







--Andries









On 2/20/18, 1:17 AM, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:



    For the moment I've created an improvement issue about this:

    
https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_DRILL-2D6170&d=DwIBaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=Q3Oz5l4W5TvDHNLpOqMYE2AgtKWFE937v89GEHyOVDU&m=69ohaJkyhIdPzNBy3ZsqNCTa19XysjZzgmn_XPJ2yXQ&s=ajRYDHKrMFgV1AMW2Q8weYDZtzb7-U5CqR9fML7ihno&e=

    

    On Tue, Feb 20, 2018 at 9:23 AM, Flavio Pompermaier <pomperma...@okkam.it>

    wrote:

    

    > Thanks Paul for this suggestion, I think I'm going to give it a try.

    > Once I've created my EasyFormatPlugin where should I put the produced jar?

    > in which folder within jars directory?

    >

    > On Tue, Feb 20, 2018 at 2:57 AM, Paul Rogers <par0...@yahoo.com.invalid>

    > wrote:

    >

    >> It may be that by "fixed width text", Flavio means a file in which the

    >> text columns are of fixed width: kind of like old-school punch cards.

    >> Drill has no reader for this use case, but if you are a Java programmer,

    >> you can create one. See Drill Pull Request #1114 [1] for one example of a

    >> regex reader along with pointers to a second example I'm building for a

    >> book. Should be easy to adopt this code to take a list of column widths 
in

    >> place of the regex. Actually, you could use the regex with a pattern that

    >> just picks out a fixed number of characters.

    >> Thanks,

    >> - Paul

    >>

    >> [1]  
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_drill_pull_1114&d=DwIBaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=Q3Oz5l4W5TvDHNLpOqMYE2AgtKWFE937v89GEHyOVDU&m=69ohaJkyhIdPzNBy3ZsqNCTa19XysjZzgmn_XPJ2yXQ&s=-0LdlBnmAXaipanP87yJezn5HPEHQIQVX5izxnNTYFY&e=

    >>

    >>

    >>

    >>

    >>     On Monday, February 19, 2018, 12:52:42 PM PST, Kunal Khatua <

    >> kkha...@mapr.com> wrote:

    >>

    >>  As long as you have delimiters, you should be able to import it as a

    >> regular CSV file. Using views that define the fixed-width nature should

    >> help operators downstream work more efficiently.

    >>

    >> -----Original Message-----

    >> From: Flavio Pompermaier [mailto:pomperma...@okkam.it]

    >> Sent: Monday, February 19, 2018 6:50 AM

    >> To: user@drill.apache.org

    >> Subject: Fixed-width files

    >>

    >> Hi to all,

    >> I'm currently looking for the best solution to load a fixed-width text

    >> file into Drill.

    >> Is there any way right now to do that? Is there anyone that already have

    >> a working connector?

    >> Is it better to implement a brand new FormatPluginConfig or

    >> StoragePluginConfig?

    >>

    >> Best,

    >> Flavio

    >>

    >>

    >

    



Reply via email to