I can reproduce the issue but I also have a workaround for it:
*1. When the storage plugin configuration for "tsv" is the default:*
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },

> select columns[0],columns[1] from `test.tsv`;
+----------+---------+
|  EXPR$0  | EXPR$1  |
+----------+---------+
| foobar   | bar     |
| aa" "bc  | null    |
+----------+---------+
2 rows selected (0.114 seconds)

*2. If we add the "quote" property to use a single quote:*
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "quote": "'",
      "delimiter": "\t"
    },

Then the columns are split correctly (the double quotes are kept as part of the values):
> select columns[0],columns[1] from `test.tsv`;
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| foobar  | bar     |
| "aa"    | "bc"    |
+---------+---------+
2 rows selected (0.11 seconds)
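
To illustrate why the workaround keeps the double quotes in the output, here is a minimal sketch using Python's standard csv module rather than Drill's text reader. This is an assumption for illustration only: it does not reproduce the Drill behaviour in case 1, and the helper name read_tsv is made up. With the quote character set to a single quote, the double quotes in the tab-delimited line are treated as ordinary data, which matches the result in case 2:

```python
import csv
import io

# Same rows as test.tsv, with real tab characters between the fields.
TSV_DATA = 'foobar\tbar\n"aa"\t"bc"\n'


def read_tsv(text, quotechar):
    """Parse tab-delimited text with the given quote character (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar=quotechar)
    return [row for row in reader]


# Quote character is a single quote, so double quotes are kept as literal data.
rows = read_tsv(TSV_DATA, quotechar="'")
print(rows)  # [['foobar', 'bar'], ['"aa"', '"bc"']]
```

If the double quotes are not wanted in the result, they would have to be stripped in the query or in a later step; the sketch only shows what changing the quote character does to parsing.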



Thanks,
Hao


On Fri, Jun 26, 2015 at 11:03 AM, Kristine Hahn <[email protected]> wrote:

> I think you might have a problem with your tsv file using spaces
> instead of tabs.
> CSV file contents:
> hello,1,2,3
> hello,1,2,3
> hello,1,2,3
>
> TSV file contents (actual tab character, not spaces):
> hello 1 2 3
> hello 1 2 3
> hello 1 2 3
>
> 0: jdbc:drill:zk=local> select * from
> `/Users/khahn/Downloads/csv_test.csv`;
> +------------------------+
> |        columns         |
> +------------------------+
> | ["hello","1","2","3"]  |
> | ["hello","1","2","3"]  |
> | ["hello","1","2","3"]  |
> +------------------------+
> 3 rows selected (0.114 seconds)
>
> TSV using tabs
> 0: jdbc:drill:zk=local> select * from
> `/Users/khahn/Downloads/tsv_test.tsv`;
> +------------------------+
> |        columns         |
> +------------------------+
> | ["hello","1","2","3"]  |
> | ["hello","1","2","3"]  |
> | ["hello","1","2","3"]  |
> +------------------------+
> 3 rows selected (0.122 seconds)
>
> TSV using spaces
>
> 0: jdbc:drill:zk=local> select * from
> `/Users/khahn/Downloads/tsv_test.tsv`;
> +------------------------+
> |        columns         |
> +------------------------+
> | ["hello   1   2   3"]  |
> | ["hello   1   2   3"]  |
> | ["hello   1   2   3"]  |
> +------------------------+
> 3 rows selected (0.117 seconds)
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn
>
>
>
> On Fri, Jun 26, 2015 at 10:02 AM, Kristine Hahn <[email protected]>
> wrote:
> > There are some attributes that were introduced in Drill 1.0 that are
> > partly documented (sorry no example):
> >
> > http://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions
> > (see "formats" . . . "quote")
> >
> > http://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats
> >
> >
> >
> >
> >
> >
> > Kristine Hahn
> > Sr. Technical Writer
> > 415-497-8107 @krishahn
> >
> >
> > On Fri, Jun 26, 2015 at 7:27 AM, Chi-Lang Ngo <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> I'm having a problem querying tab-delimited (tsv) files that contain quotes.
> >>
> >> Drill doesn't seem to recognise quotes in tsv while working fine for csv
> >> files.
> >> For example, given the following files
> >>
> >> test.tsv
> >> -------
> >> foobar bar
> >> "aa" "bc"
> >> -------
> >>
> >> test.csv
> >> ----------
> >> foobar,bar
> >> "aa","bc"
> >> ----------
> >>
> >> I get these results
> >>
> >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> >> dfs.`/test.csv`;
> >> +---------+---------+
> >> | EXPR$0  | EXPR$1  |
> >> +---------+---------+
> >> | foobar  | bar     |
> >> | aa      | bc      |
> >> +---------+---------+
> >> 2 rows selected (0.259 seconds)
> >>
> >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> >> dfs.`/test.tsv`;
> >> +----------+---------+
> >> |  EXPR$0  | EXPR$1  |
> >> +----------+---------+
> >> | foobar   | bar     |
> >> | aa" "bc  | null    |
> >> +----------+---------+
> >> 2 rows selected (0.122 seconds)
> >>
> >> Any ideas?
> >> CL
> >
> >
>
