This is a reasonable hack for some cases, but I'm pretty sure it will break
the most common purpose of having quotes at all. If you put the delimiter
(tab) between quotes, the line will be split on that character where it
shouldn't be. There is also the issue that the quotes are no longer being
removed at the scan level, so if you try to do something like cast this
output you will get errors unless you use a case statement to remove the
quotes where necessary.
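
To illustrate both problems outside of Drill, here is a rough sketch using
Python's csv module as a stand-in for a quote-aware text reader. This is not
Drill's parser; the sample data and the to_int helper are made up for the
example. It only shows what happens in general when the quote character is
switched from a double quote to a single quote on data that is actually
double-quoted.

```python
import csv
import io

# Hypothetical sample: the second row has a tab inside a double-quoted field
# and a double-quoted number.
SAMPLE = 'foobar\tbar\n"a\ta"\t"12"\n'


def parse(text, quotechar):
    """Parse tab-delimited text using the given quote character."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar=quotechar)
    return list(reader)


def to_int(value):
    """Strip surrounding double quotes before casting (the manual cleanup step)."""
    return int(value.strip('"'))


# Double quote as the quote character: the embedded tab stays inside the
# field and the quotes are removed by the reader.
double_quoted = parse(SAMPLE, '"')
print(double_quoted[1])  # ['a\ta', '12']

# Single quote as the quote character: the double quotes are now ordinary
# data, so the embedded tab splits the field and the quotes are kept.
single_quoted = parse(SAMPLE, "'")
print(single_quoted[1])  # ['"a', 'a"', '"12"']

# Casting the raw value fails; it only works after stripping the quotes.
try:
    int(single_quoted[1][2])
except ValueError:
    print("cast failed on", single_quoted[1][2])
print(to_int(single_quoted[1][2]))  # 12
```

The same two effects are what I would expect with the "quote": "'" setting:
an extra column wherever a quoted field contains a tab, and literal quote
characters that have to be stripped before any cast.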

On Fri, Jun 26, 2015 at 11:16 AM, Hao Zhu <h...@maprtech.com> wrote:

> I can reproduce the issue but I also have a workaround for it:
> *1. When storage plugin for "tsv" is default:*
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>
> > select columns[0],columns[1] from `test.tsv`;
> +----------+---------+
> |  EXPR$0  | EXPR$1  |
> +----------+---------+
> | foobar   | bar     |
> | aa" "bc  | null    |
> +----------+---------+
> 2 rows selected (0.114 seconds)
>
> *2. If we add "quote" property to use single quote:*
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "quote": "'",
>       "delimiter": "\t"
>     },
>
> Then it works fine:
> > select columns[0],columns[1] from `test.tsv`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | foobar  | bar     |
> | "aa"    | "bc"    |
> +---------+---------+
> 2 rows selected (0.11 seconds)
>
>
>
> Thanks,
> Hao
>
>
> On Fri, Jun 26, 2015 at 11:03 AM, Kristine Hahn <kh...@maprtech.com>
> wrote:
>
> > I think you might have a problem with your tsv file using spaces
> > instead of tabs.
> > CSV file contents:
> > hello,1,2,3
> > hello,1,2,3
> > hello,1,2,3
> >
> > TSV file contents (actual tab character, not spaces):
> > hello 1 2 3
> > hello 1 2 3
> > hello 1 2 3
> >
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/csv_test.csv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.114 seconds)
> >
> > TSV using tabs
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.122 seconds)
> >
> > TSV using spaces
> >
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello   1   2   3"]  |
> > | ["hello   1   2   3"]  |
> > | ["hello   1   2   3"]  |
> > +------------------------+
> > 3 rows selected (0.117 seconds)
> > Kristine Hahn
> > Sr. Technical Writer
> > 415-497-8107 @krishahn
> >
> >
> >
> > On Fri, Jun 26, 2015 at 10:02 AM, Kristine Hahn <kh...@maprtech.com>
> > wrote:
> > > There are some attributes that were introduced in Drill 1.0 that are
> > > partly documented (sorry no example):
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions
> > > (see "formats" . . . "quote")
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats
> > >
> > > Kristine Hahn
> > > Sr. Technical Writer
> > > 415-497-8107 @krishahn
> > >
> > >
> > > On Fri, Jun 26, 2015 at 7:27 AM, Chi-Lang Ngo <chil...@gmail.com>
> > > wrote:
> > >>
> > >> Hi,
> > >>
> > >> I'm having a problem querying tab-delimited (tsv) files which have
> > >> quotes.
> > >>
> > >> Drill doesn't seem to recognise quotes in tsv while working fine for
> > >> csv files.
> > >> For example, given the following files
> > >>
> > >> test.tsv
> > >> -------
> > >> foobar bar
> > >> "aa" "bc"
> > >> -------
> > >>
> > >> test.csv
> > >> ----------
> > >> foobar,bar
> > >> "aa","bc"
> > >> ----------
> > >>
> > >> I get these results
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> > >> dfs.`/test.csv`;
> > >> +---------+---------+
> > >> | EXPR$0  | EXPR$1  |
> > >> +---------+---------+
> > >> | foobar  | bar     |
> > >> | aa      | bc      |
> > >> +---------+---------+
> > >> 2 rows selected (0.259 seconds)
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> > >> dfs.`/test.tsv`;
> > >> +----------+---------+
> > >> |  EXPR$0  | EXPR$1  |
> > >> +----------+---------+
> > >> | foobar   | bar     |
> > >> | aa" "bc  | null    |
> > >> +----------+---------+
> > >> 2 rows selected (0.122 seconds)
> > >>
> > >> Any ideas?
> > >> CL
> > >
> > >
> >
>
