Re: Report issues with sensitive data

2015-04-03 Thread Alexander Reshetov
Hi, Andries, Ted, thanks for the quick replies. Yes, I'm using the latest official build of 0.8. I investigated possible issues and also found a way to hide sensitive data. Please see the issue regarding this [1]. In the process I found one strange behavior which I assume leads to this issue. (

Re: Parquet File Weirdness

2015-04-03 Thread Steven Phillips
Parquet has a few primitive types, one of which is Binary array. These primitive types are used to store different "converted types". For example, one of the converted types that uses binary array is "UTF8" string. I believe that the parquet files you are querying do not have the "converted type" s
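
If the files do lack the UTF8 annotation, one way to read such a binary column as text in Drill is to decode it explicitly with CONVERT_FROM. A minimal sketch, assuming a hypothetical file path and a hypothetical column named `name` (neither comes from the thread):

```sql
-- Hypothetical path and column name; substitute your own.
-- CONVERT_FROM(..., 'UTF8') decodes a binary value as a UTF-8 string.
SELECT CONVERT_FROM(t.`name`, 'UTF8') AS name_text
FROM dfs.`/path/to/parquet_table` t
LIMIT 10;
```

Columns that already carry a numeric or other non-binary type need no conversion; only the binary ones that display as raw bytes do.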

Re: Parquet File Weirdness

2015-04-03 Thread Andries Engelbrecht
Are you reading the data using the Hive Storage plugin for Drill and using the Metastore, or are you directly querying the parquet files on the filesystem with Drill? —Andries On Apr 3, 2015, at 12:05 PM, John Omernik wrote: > I have a table in Hive (no partitions, single level, stored as P

Parquet File Weirdness

2015-04-03 Thread John Omernik
I have a table in Hive (no partitions, single level, stored as PARQUET (hive-0.13)). When I query it in Hive, it works fine. When I run a count(*) on it in Drill, it works (fast), but when I run a query, it seems to return the same number of results, but it looks like this... thoughts? (These should b

Re: Nested or Array JSON

2015-04-03 Thread Kristine Hahn
Here are some examples of queries on the actual data you are using: This query extracts some data from the "meta" map. select t.meta.`view`.`id` from dfs.`/Users/khahn/Documents/test_files_source/opendata.json` t; > ++ > > | EXPR$0 | > > ++ > > | n2rk-fwkj | > > +---

Re: Nested or Array JSON

2015-04-03 Thread Jason Altekruse
The link Kristine posted gave me a 404; here is the corrected link hosted on the Apache server. http://drill.apache.org/docs/json-data-model/#handling-type-differences On Fri, Apr 3, 2015 at 7:49 AM, Kristine Hahn wrote: > You solve the "Needed to be in state INIT or IN_VARCHAR but in mode > IN

Re: TSV with a JSON column?

2015-04-03 Thread Kristine Hahn
Link correction: http://apache.github.io/drill/docs/json-data-model#handling-type-differences Kristine Hahn Sr. Technical Writer 415-497-8107 @krishahn On Fri, Apr 3, 2015 at 6:57 AM, Vince Gonzalez wrote: > Yep: > > 0: jdbc:drill:zk=local> select jblob['v'] from (select > convert_from(columns

Re: Nested or Array JSON

2015-04-03 Thread Kristine Hahn
You solve the "Needed to be in state INIT or IN_VARCHAR but in mode IN_BIGINT" error by using all_text_mode to resolve the schema differences, as described in http://apache.github.io/drill/docs/json-data-model/handling-type-differences. On my jdbc connection, for example: > select * from dfs.`/Users/ope
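
As a rough sketch of the all_text_mode approach mentioned above: the option name `store.json.all_text_mode` is the one given in the Drill JSON data model documentation, while the file path below is a hypothetical placeholder, not the path from the thread.

```sql
-- Read every JSON scalar as VARCHAR so that a field holding
-- both strings and numbers no longer causes a type clash.
ALTER SYSTEM SET `store.json.all_text_mode` = true;

-- Hypothetical path; substitute your own JSON file.
SELECT * FROM dfs.`/path/to/file.json` LIMIT 10;
```

With the option on, all values come back as text, so numeric fields have to be cast explicitly wherever arithmetic is needed.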

Re: TSV with a JSON column?

2015-04-03 Thread Jinfeng Ni
That's right. columns[1] from a TSV file is regarded as a VARCHAR column. You have to use the convert_from function. 0: jdbc:drill:zk=local> select t.colA.v from (select convert_from(columns[1], 'JSON') from `AncestrySample100.tsv` limit 1) as t(colA); ++ | EXPR$0 | ++ | 3.0

Re: TSV with a JSON column?

2015-04-03 Thread Vince Gonzalez
Yep: 0: jdbc:drill:zk=local> select jblob['v'] from (select convert_from(columns[1], 'JSON') as jblob from `AncestrySample100.tsv`) limit 1; ++ | EXPR$0 | ++ | 3.0| ++ 1 row selected (0.136 seconds) Thanks Carol! On Fri, Apr 3, 2015 at 9:28 AM, Ca

Re: TSV with a JSON column?

2015-04-03 Thread Carol McDonald
maybe something like select convert_from(t.columns[1], 'JSON') from AncestrySample100.tsv t On Fri, Apr 3, 2015 at 9:06 AM, Vince Gonzalez wrote: > Can I tell Drill to parse the JSON in a column of a TSV? > > cd /tmp > curl -L --output AncestrySample100.tsv > > https://raw.githubusercontent.com

TSV with a JSON column?

2015-04-03 Thread Vince Gonzalez
Can I tell Drill to parse the JSON in a column of a TSV? cd /tmp curl -L --output AncestrySample100.tsv https://raw.githubusercontent.com/ThinkBigAnalytics/ThinkBigChallenge2014/master/data/AncestrySample100 ... 0: jdbc:drill:zk=local> use dfs.tmp; 0: jdbc:drill:zk=local> select columns[1] from

Re: Nested or Array JSON

2015-04-03 Thread Muthu Pandi
Tried with FLATTEN, but the result is the same. Kindly help with pointers: "ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT * FROM `HDFS`.`root`.`./user/hadoop2/unclaimedaccount.json` LIMIT 100 [30024]Query execution error. Details:[ Query stopped., Needed to be in stat

Re: Problem with hbase query

2015-04-03 Thread Aditya
The 4th line should be "hbase.zookeeper.quorum": "10.10.100.62,10.10.10.52", without the angle brackets around the ZooKeeper quorum. On Thu, Apr 2, 2015 at 9:44 PM, Mahesh Sankaran wrote: > Hi Andries and Aditya, > >I have queried specific CF from hbase table.Still it shows