Re: query from hbase issue

2016-05-23 Thread qiang li
I upgraded to the latest CDH version, which is HBase 1.2.0-cdh5.7.0, and tested it again. The results are correct now. Thanks for the help. Even so, I think the query plan can still be optimized. Here is what I think can be improved: a. specify the columns when needed b. remove the rowFilter when scan

Re: "user" as a reserved word

2016-05-23 Thread Jinfeng Ni
mydb=# select "user" from t1; user -- ABC I should take back what I said. With a quoted identifier, Postgres behaved differently from Drill. Both of the interpretations seem to be reasonable, since the identifier could represent two different things. On Mon, May 23, 2016 at 7:41 PM, Zelaine

Re: "user" as a reserved word

2016-05-23 Thread Zelaine Fong
Jinfeng, What does postgres return for the following query in your example? select "user" from t1; -- Zelaine On Mon, May 23, 2016 at 7:39 PM, John Omernik wrote: > Hmm, you are correct, I don't have to like it :) but there is both logic > and precedence here. Thanks for

Re: query from hbase issue

2016-05-23 Thread qiang li
Yes, it seems like the same issue. I will upgrade and test again. But do you think we can update the physical plan too? If we only want to query one qualifier, then the columns of the plan should contain only that qualifier instead of "*". Maybe the query would then be faster. Am I right?

Re: "user" as a reserved word

2016-05-23 Thread Jinfeng Ni
A quoted identifier is still an identifier (Drill uses the backtick as its quote character). Per the SQL standard, identifiers such as CURRENT_USER / USER / CURRENT_SESSION / etc. are implicit function calls; no () is required. I checked Postgres, and it seems to have the same behavior. mydb=# create table t1 (id int, "user"

Re: "user" as a reserved word

2016-05-23 Thread John Omernik
Can (should) things inside back ticks be callable? I guess this makes a very difficult situation from a usability standpoint because user is a not uncommon column name (think security logs, web logs, etc) yet in the current setup there is lots of possibility for assumptions on calling back tick

Re: Hangout Frequency

2016-05-23 Thread Parth Chandra
The overwhelming response (?!) seems to have been to agree to have the hangout every other week. So the next hangout will be Tuesday 5/31. See you all then. On Fri, May 20, 2016 at 7:34 PM, Aman Sinha wrote: > Every other week sounds good to me. It is a substantial

Re: CORS for Apache Drill

2016-05-23 Thread Hanifi GUNES
Great. CrossOriginFilter ships with jetty-servlets package so you will need to import it in exec pom file at [1]. Also you may enjoy additional community support if you target dev@ list for your code/implementation related questions. Let me know. -Hanifi 1:

Re: "user" as a reserved word

2016-05-23 Thread Jinfeng Ni
The problem here is that identifier 'user' is not only a reserved word, but also represents a special function == current_user() call. The identifier 'user', whether it's quoted or not, could mean either column name or the function call. Without the table alias, it could be ambiguous to sql

Re: Reading and converting Parquet files intended for Impala

2016-05-23 Thread John Omernik
Troubleshooting this is made more difficult by the fact that the file that gives the error works fine when I select directly from it into a new table... this makes it very tricky to troubleshoot, any assistance on this would be appreciated, I've opened a ticket with MapR as well, but I am stumped,

Re: Issue with Queries Hanging

2016-05-23 Thread John Omernik
Distributed. (MapR FS, but via NFS) On Mon, May 23, 2016 at 3:26 PM, Abdel Hakim Deneche wrote: > One question about the missing query profile: do you store the query > profiles in the local file system or the distributed file system ? > > On Mon, May 23, 2016 at 9:31

Re: Issue with Queries Hanging

2016-05-23 Thread Abdel Hakim Deneche
One question about the missing query profile: do you store the query profiles in the local file system or the distributed file system ? On Mon, May 23, 2016 at 9:31 AM, John Omernik wrote: > Hey all, this is separate, yet related issue to my other posts RE Parquet, > however,

Re: "user" as a reserved word

2016-05-23 Thread John Omernik
Ya, as I am testing, this works, however, the users of the system expect to be able to use `user` and while I can provide them instructions to use a table alias, I am very worried that they will forget and since it doesn't error, but instead puts in a different string, this could lead to bad

Re: "user" as a reserved word

2016-05-23 Thread John Omernik
I filed https://issues.apache.org/jira/browse/DRILL-4692 I see an alias would work as a tmp fix, but this should be addressed (I wonder if other words may have a problem too?) On Mon, May 23, 2016 at 12:38 PM, Andries Engelbrecht < aengelbre...@maprtech.com> wrote: > Hmm interesting. > > As a

"user" as a reserved word

2016-05-23 Thread John Omernik
I have data with a field name user. When I select, with backticks, it doesn't show the field, but instead my current logged in user... select CONVERT_FROM(`user`, 'UTF8') as `user` from table limit 10; Shouldn't the backticks allow me to reference the field properly? John
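A minimal sketch of the table-alias workaround reported in this thread, written in Python only to assemble the query text. The table path dfs.tmp.`weblogs` and the helper name qualified_select are hypothetical; the point is that the column is written as t.`user` rather than as a bare `user`, which the thread reports as resolving to the column instead of the USER function.

```python
def qualified_select(table: str, alias: str, column: str, limit: int = 10) -> str:
    """Build a SELECT that references `column` through a table alias.

    Assumption: qualifying the column with the alias (t.`user`) makes Drill
    treat it as a column reference, as reported in this thread.
    """
    col = f"{alias}.`{column}`"
    return (
        f"SELECT CONVERT_FROM({col}, 'UTF8') AS `{column}` "
        f"FROM {table} AS {alias} LIMIT {limit}"
    )


# Hypothetical table path, for illustration only.
query = qualified_select("dfs.tmp.`weblogs`", "t", "user")
print(query)
```

Because the unqualified form does not raise an error, generating the qualified form in one place may be safer than relying on every user to remember the alias.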

Re: Issue with Queries Hanging

2016-05-23 Thread John Omernik
Note: I did see after letting one just hang for a long time, a message about HEAP space... I was running with 12GB of Heap and 96 GB of Direct, I switched to 24 GB of Heap and 84 GB of Direct, and now my queries fail, but all with the index out of bounds issue, and then my drill bits stay

Issue with Queries Hanging

2016-05-23 Thread John Omernik
Hey all, this is a separate, yet related, issue to my other posts RE Parquet; however, I thought I'd post this to see if this is normal or should be handled (and/or JIRAed). I am running Drill 1.6. If you've read the other posts, I am trying to CTAS a large amount of data (largish) 120 GB from

DATA_READ ERROR: Error processing input: Cannot use newline character within quoted string

2016-05-23 Thread Wilburn, Scott
Hello Drill community, I'm seeing this error with Drill 1.5 when querying CSV or TSV data that contains quotes within some of the fields. I have a couple questions about this. 1. Does anyone know why Drill cares about quotes in the data? 2. Is there a workaround to this problem, besides
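As a rough illustration of why a delimited-text parser may care about quotes at all, the sketch below uses Python's standard csv module as a stand-in. It is not Drill's text reader, and the sample data is invented. With quote handling enabled, an unterminated quote makes the parser treat the following newline as part of the field; with quote handling disabled, the quote is an ordinary character.

```python
import csv
import io

# Invented sample: the second line opens a quote that is never closed.
data = 'id,comment\n1,"unterminated\n2,fine\n'

# Default dialect: the quote starts a quoted field, so the newline that
# follows is swallowed into that field and the next line is merged into it.
quoted_rows = list(csv.reader(io.StringIO(data)))

# QUOTE_NONE: the quote character has no special meaning, so every
# physical line becomes its own row.
plain_rows = list(csv.reader(io.StringIO(data), quoting=csv.QUOTE_NONE))

print(len(quoted_rows), quoted_rows)
print(len(plain_rows), plain_rows)
```

A reader that rejects newlines inside quoted strings would presumably report an error at the point where this sketch silently merges the lines.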

Re: query from hbase issue

2016-05-23 Thread Krystal Nguyen
Hi Qiang, Looks like you might be encountering this issue: https://issues.apache.org/jira/browse/DRILL-4271 Thanks On Sun, May 22, 2016 at 8:38 PM, qiang li wrote: > I test it step by step again. And finally I find out that the issue > happened only if the qualifier

Re: Performance tuning for TPC-H Q1 on a three nodes cluster

2016-05-23 Thread Dechang Gu
Hi Yijie, This is Dechang at MapR. I work on Drill performance. From what you described, it looks like the scan took most of the time. How are the files distributed on the disks? Is there any skew? How many disks are there? If possible, can you provide the profile for the run? Thanks, Dechang On

Re: Reading Parquet Files Created Elsewhere

2016-05-23 Thread John Omernik
That did work faster, I can now get the 10 rows in 12 seconds as opposed to 25. So in my 25 sec. query, I CAST all items from the parquet, but do I need to do that? For the 12-second query, I only use CONVERT_FROM on the string values, and the view seems happier. So that's nice. Thanks for the point, I am

Re: Reading Parquet Files Created Elsewhere

2016-05-23 Thread Andries Engelbrecht
John, See if convert_from helps in this regard, I believe it is supposed to be faster than cast varchar. This is likely what will work on your data CONVERT_FROM(, 'UTF8') Hopefully someone with more in depth knowledge of the Drill Parquet reader can comment. --Andries > On May 23, 2016,
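A small sketch of applying CONVERT_FROM only to the string columns and passing the other columns through unchanged, written in Python to generate the select list. The column names, the table path dfs.tmp.`impala_parquet`, and the helper select_list are all hypothetical.

```python
def select_list(columns, string_columns):
    """Return a select list that wraps only string columns in CONVERT_FROM.

    Assumption: the non-string columns can be selected as-is.
    """
    parts = []
    for name in columns:
        if name in string_columns:
            parts.append(f"CONVERT_FROM(`{name}`, 'UTF8') AS `{name}`")
        else:
            parts.append(f"`{name}`")
    return ", ".join(parts)


# Hypothetical schema, for illustration only.
columns = ["event_id", "src_host", "bytes_sent"]
sql = (
    f"SELECT {select_list(columns, {'src_host'})} "
    "FROM dfs.tmp.`impala_parquet` LIMIT 10"
)
print(sql)
```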

Reading and converting Parquet files intended for Impala

2016-05-23 Thread John Omernik
I have a largish directory of parquet files generated for use in Impala. They were created with the CDH version of apache-parquet-mr (not sure on version at this time) Some settings: Compression: snappy Use Dictionary: true WRITER_VERSION: PARQUET_1_0 I can read them as is in Drill, however, the

Re: Reading Parquet Files Created Elsewhere

2016-05-23 Thread John Omernik
I am learning more about my data here, the data was created in a CDH version of the apache parquet-mr library. (Not sure version yet, getting that soon). They used snappy and version 1.0 of the Parquet spec due to Impala needing it. They are also using setEnableDictionary on the write. Trying

Re: Reading Parquet Files Created Elsewhere

2016-05-23 Thread Todd
Looks like Impala encoded the strings as binary data. I think there is some configuration in Drill (I know Spark has one) that helps do the conversion. At 2016-05-23 21:25:17, "John Omernik" wrote: >Hey all, I have some Parquet files that I believe were made in a Map Reduce >job

Reading Parquet Files Created Elsewhere

2016-05-23 Thread John Omernik
Hey all, I have some Parquet files that I believe were made in a Map Reduce job and work well in Impala, however, when I read them in Drill, the fields that are strings come through as [B@25ddbb etc. The exact string represented as regex would be /\[B@[a-f0-9]{8}/ (Pointers maybe?) Well, I found
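For context, a value such as [B@25ddbb has the shape of Java's default toString() for a byte array: the type tag [B, an @, and the object's hash code in hexadecimal. That suggests undecoded binary values rather than pointers stored in the data. The hex part is not always eight digits (the example above has six), so the sketch below relaxes the quantifier to {1,8}. The helper name is hypothetical.

```python
import re

# "[B" is the JVM type name for byte[]; the hex hash code that follows "@"
# can be shorter than 8 digits, so {1,8} is used instead of {8}.
BYTE_ARRAY_REPR = re.compile(r"\[B@[0-9a-f]{1,8}")


def looks_like_byte_array_repr(value: str) -> bool:
    """Return True if the whole value looks like a Java byte[] toString()."""
    return BYTE_ARRAY_REPR.fullmatch(value) is not None


print(looks_like_byte_array_repr("[B@25ddbb"))
print(looks_like_byte_array_repr("plain text"))
# The stricter 8-digit pattern does not match the 6-digit example value.
print(re.fullmatch(r"\[B@[a-f0-9]{8}", "[B@25ddbb") is None)
```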

Re: Regarding Excel Files and Ms Access File.

2016-05-23 Thread Khurram Faraaz
Sanjiv, you can try exporting your data from the MS Excel file to a text/CSV file using the Save As option, and then query that CSV/text file from Drill. On Thu, May 19, 2016 at 12:38 AM, Antonio Romero (carnorom) < carno...@cisco.com> wrote: > http://ucanaccess.sourceforge.net/site.html > > This

Re: Drill Issues

2016-05-23 Thread Khurram Faraaz
You can use CONVERT_TO and CONVERT_FROM functions, to get HDFS bytes read/write from drill queries. https://drill.apache.org/docs/supported-data-types/#data-types-for-convert_to-and-convert_from-functions On Mon, May 23, 2016 at 11:28 AM, vinita.go...@siemens.com < vinita.go...@siemens.com>

Re: Drill Issues

2016-05-23 Thread Tushar Pathare
Hello Vinita, A simple storage plugin for Hive looks like this: { "type": "hive", "enabled": true, "configProps": { "hive.metastore.uris": "thrift://scflexnode09:9083", "hive.metastore.sasl.enabled": "false", "fs.default.name": "hdfs://scflexnode09:8020/" } }
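The same configuration, laid out one key per line with Python's json module so the nesting is easier to read. The keys and values are taken from the message; the host name scflexnode09 and the ports belong to the poster's cluster and would need to be replaced with your own metastore and NameNode addresses.

```python
import json

# Values copied from the message; host name and ports are site-specific.
hive_plugin = {
    "type": "hive",
    "enabled": True,
    "configProps": {
        "hive.metastore.uris": "thrift://scflexnode09:9083",
        "hive.metastore.sasl.enabled": "false",
        "fs.default.name": "hdfs://scflexnode09:8020/",
    },
}

# Serialize to indented JSON text.
config_text = json.dumps(hive_plugin, indent=2)
print(config_text)
```

Note that "enabled" is a JSON boolean, while "hive.metastore.sasl.enabled" is the string "false", exactly as in the message.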

RE: Drill Issues

2016-05-23 Thread vinita.go...@siemens.com
Hi, how can I use Hive tables for running Drill queries? What will be the storage plug-in? Please help. I have tried a lot but could not work out how to do it. Thanks and Regards Vinita Goyal