I upgraded to the latest CDH version, which is HBase 1.2.0-cdh5.7.0, and tested it
again. The results are correct now. Thanks for the help.
Even so, I think the query plan can still be optimized.
Here is what I think can be improved:
a. specify the columns only when needed
b. remove the RowFilter from the scan
mydb=# select "user" from t1;
 user
------
 ABC
I should take back what I said. With a quoted identifier, Postgres
behaves differently from Drill. Both of the interpretations seem to be
reasonable, since the identifier could represent two different things.
On Mon, May 23, 2016 at 7:41 PM, Zelaine
Jinfeng,
What does postgres return for the following query in your example?
select "user" from t1;
-- Zelaine
On Mon, May 23, 2016 at 7:39 PM, John Omernik wrote:
> Hmm, you are correct, I don't have to like it :) but there is both logic
> and precedent here. Thanks for
Yes, it seems like the same issue. I will upgrade and test again. But
do you think we can update the physical plan too? If we only want to query
one qualifier, then the columns of the plan should contain only that
qualifier instead of "*". Maybe this plan would query faster. Am I right?
A quoted identifier is still an identifier (Drill uses the back tick as the
quote character). Per the SQL standard, the identifiers CURRENT_USER / USER /
CURRENT_SESSION / etc. are implicit function calls; no () is required.
I checked Postgres, and it seems to have the same behavior.
mydb=# create table t1 (id int, "user"
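On that reading, the implicit-call behavior can be sketched as follows (a sketch; the VALUES subquery is just a one-row dummy source, and the output depends on the logged-in user):

```sql
-- USER / CURRENT_USER parse as implicit function calls; no () is required:
SELECT USER, CURRENT_USER
FROM (VALUES (1));
```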
Can (should) things inside back ticks be callable? I guess this makes for a
very difficult situation from a usability standpoint, because user is not an
uncommon column name (think security logs, web logs, etc.), yet in the
current setup there is a lot of possibility for assumptions on calling back
tick
The overwhelming response (?!) seems to have been to agree to have the
hangout every other week.
So the next hangout will be Tuesday 5/31.
See you all then.
On Fri, May 20, 2016 at 7:34 PM, Aman Sinha wrote:
> Every other week sounds good to me. It is a substantial
Great. CrossOriginFilter ships with the jetty-servlets package, so you will
need to add it as a dependency in the exec pom file at [1].
Also, you may enjoy additional community support if you target the dev@ list
for your code/implementation-related questions.
Let me know.
-Hanifi
1:
The problem here is that the identifier 'user' is not only a reserved
word, but also represents a special function == a current_user() call.
The identifier 'user', whether it's quoted or not, could mean either a
column name or the function call. Without a table alias, it could
be ambiguous to sql
Troubleshooting this is made more difficult by the fact that the file that
gives the error works fine when I select directly from it into a new
table... this makes it very tricky to troubleshoot. Any assistance on this
would be appreciated. I've opened a ticket with MapR as well, but I am
stumped.
Distributed. (MapR FS, but via NFS)
On Mon, May 23, 2016 at 3:26 PM, Abdel Hakim Deneche
wrote:
> One question about the missing query profile: do you store the query
> profiles in the local file system or the distributed file system ?
>
> On Mon, May 23, 2016 at 9:31
One question about the missing query profile: do you store the query
profiles in the local file system or the distributed file system ?
On Mon, May 23, 2016 at 9:31 AM, John Omernik wrote:
> Hey all, this is separate, yet related issue to my other posts RE Parquet,
> however,
Yeah, as I am testing, this works; however, the users of the system expect to
be able to use `user`, and while I can provide them instructions to use a
table alias, I am very worried that they will forget, and since it doesn't
error but instead puts in a different string, this could lead to bad
I filed https://issues.apache.org/jira/browse/DRILL-4692
I see an alias would work as a tmp fix, but this should be addressed (I
wonder if other words may have a problem too?)
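For anyone hitting this in the meantime, the alias workaround mentioned above looks like this (a sketch; the table path and data are hypothetical):

```sql
-- Without an alias, `user` resolves to the current session user:
SELECT `user` FROM dfs.tmp.`weblogs` LIMIT 10;

-- With a table alias, t.`user` unambiguously names the column:
SELECT t.`user` FROM dfs.tmp.`weblogs` t LIMIT 10;
```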
On Mon, May 23, 2016 at 12:38 PM, Andries Engelbrecht <
aengelbre...@maprtech.com> wrote:
> Hmm interesting.
>
> As a
I have data with a field named user.
When I select it, with backticks, it doesn't show the field, but instead my
current logged-in user...
select CONVERT_FROM(`user`, 'UTF8') as `user` from table limit 10;
Shouldn't the backticks allow me to reference the field properly?
John
Note: after letting one just hang for a long time, I did see a message
about HEAP space... I was running with 12 GB of Heap and 96 GB of Direct; I
switched to 24 GB of Heap and 84 GB of Direct, and now my queries fail, but
all with the index-out-of-bounds issue, and then my drillbits stay
Hey all, this is a separate, yet related, issue to my other posts RE Parquet;
however, I thought I'd post this to see if this is normal or should be
handled (and/or JIRAed).
I am running Drill 1.6; if you've read the other posts, I am trying to CTAS
a large amount of data (largish, 120 GB) from
Hello Drill community,
I'm seeing this error with Drill 1.5 when querying CSV or TSV data that
contains quotes within some of the fields. I have a couple questions about
this.
1. Does anyone know why Drill cares about quotes in the data?
2. Is there a workaround to this problem, besides
Hi Qiang,
Looks like you might be encountering this issue:
https://issues.apache.org/jira/browse/DRILL-4271
Thanks
On Sun, May 22, 2016 at 8:38 PM, qiang li wrote:
> I test it step by step again. And finally I find out that the issue
> happened only if the qualifier
Hi Yijie,
This is Dechang at MapR. I work on Drill performance.
From what you described, it looks like the scan took most of the time.
How are the files distributed on the disks? Is there any skew?
How many disks are there?
If possible, can you provide the profile for the run?
Thanks,
Dechang
On
That did work faster; I can now get the 10 rows in 12 seconds as opposed to
25.
So in my 25 sec. query, I CAST all items from the parquet, but do I need to
do that? For the 12-second query, I only CONVERT_FROM the string values, and
the view seems happier. So that's nice.
Thanks for the pointer, I am
John,
See if convert_from helps in this regard; I believe it is supposed to be faster
than casting to varchar.
This is likely what will work on your data:
CONVERT_FROM(<column>, 'UTF8')
Hopefully someone with more in depth knowledge of the Drill Parquet reader can
comment.
--Andries
> On May 23, 2016,
I have a largish directory of parquet files generated for use in Impala.
They were created with the CDH version of apache-parquet-mr (not sure on
version at this time)
Some settings:
Compression: snappy
Use Dictionary: true
WRITER_VERSION: PARQUET_1_0
I can read them as is in Drill, however, the
I am learning more about my data here, the data was created in a CDH
version of the apache parquet-mr library. (Not sure of the version yet;
getting that soon.) They used snappy and version 1.0 of the Parquet spec due to
Impala needing it. They are also using setEnableDictionary on the write.
Trying
Looks like Impala encoded strings as binary data. I think there is some
configuration in Drill (I know Spark has one) that helps do the conversion.
At 2016-05-23 21:25:17, "John Omernik" wrote:
>Hey all, I have some Parquet files that I believe were made in a Map Reduce
>job
Hey all, I have some Parquet files that I believe were made in a Map Reduce
job and work well in Impala, however, when I read them in Drill, the fields
that are strings come through as [B@25ddbb etc. The exact string
represented as regex would be /\[B@[a-f0-9]{8}/ (Pointers maybe?)
Well, I found
Sanjiv, you can try exporting your data from the MS Excel file to a text/CSV
file using the Save As option, and then query that CSV/text file from
Drill.
On Thu, May 19, 2016 at 12:38 AM, Antonio Romero (carnorom) <
carno...@cisco.com> wrote:
> http://ucanaccess.sourceforge.net/site.html
>
> This
You can use CONVERT_TO and CONVERT_FROM functions, to get HDFS bytes
read/write from drill queries.
https://drill.apache.org/docs/supported-data-types/#data-types-for-convert_to-and-convert_from-functions
On Mon, May 23, 2016 at 11:28 AM, vinita.go...@siemens.com <
vinita.go...@siemens.com>
Hello Vinita,
A simple storage plugin for Hive looks like this:
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://scflexnode09:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "hdfs://scflexnode09:8020/"
  }
}
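Once a plugin like the one above is saved and enabled (here under the name hive), Hive tables can be queried directly from Drill; a sketch, with a hypothetical table name:

```sql
-- Browse what the plugin exposes:
SHOW DATABASES;
USE hive.`default`;
SHOW TABLES;

-- Query a (hypothetical) Hive table through Drill:
SELECT * FROM hive.`default`.`my_table` LIMIT 10;
```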
Hi,
How can I use Hive tables for running Drill queries? What will be the storage
plug-in? Please help; I have tried a lot but could not figure out how to do it.
Thanks and Regards
Vinita Goyal