I am not familiar with the Spark terminology, but I think you more or less
have it correct.
Your example query does not necessarily involve a hash-exchange (roughly the
equivalent to a shuffle), because it's possible to run the entire execution
in a single fragment. In this case, it would probably
Yes, it is possible to run an embedded drillbit in an application. A simple
example is this tool:
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/client/QuerySubmitter.java
If the local option is specified, it will start up one or more drillbits.
On We
t.batter.id from (select flatten(batters.batter)) as batter from
`sample.json`) t;
On Thu, Mar 10, 2016 at 6:25 PM, Steven Phillips wrote:
> Yeah, it's definitely a bug. Could you please file a jira?
>
> On Thu, Mar 10, 2016 at 6:19 PM, Jiang Wu wrote:
>
>> Here are the complete
ode simply picks the
> first N values from the union of all values across all rows. The N is the
> number of rows in the result.
>
> For example, if I give this query:
>
> 0: jdbc:drill:zk=local> select id, t.batters.batter.id from
> dfs.`c:\tmp\sample.json` t;
> +---
I am surprised that you are getting that result. I would have expected the
query to fail. Since batter is an array, you should specify the index of
the array if you want to access lower level elements.
A way to access all of the sub-fields of a repeated map is something we've
discussed, but never i
This parameter is under the sort namespace. It applies to the TopN sort
operator. While computing the TopN, we hold onto incoming batches and
maintain a list of references to the records which make up the current
TopN. Periodically, we will copy the records that we want to keep, and
release the bat
I don't understand why they wouldn't be allowed. They seem perfectly valid.
On Thu, Feb 11, 2016 at 9:42 AM, Abdel Hakim Deneche
wrote:
> I have the following table tpch100/lineitem that contains 97 parquet files:
>
> tpch100/lineitem/part-m-0.parquet
> tpch100/lineitem/part-m-1.parquet
Hi Burton,
Like John said, I would strongly recommend using the same configuration for
each drillbit.
The reason memory settings are in drill-env is that we use the standard
java options to limit the amount of memory the jvm can use.
drill-override.conf contains Drill specific settings.
Let me k
You can file a jira at https://issues.apache.org/jira/browse/DRILL/
When you file, go ahead and assign it to me.
On Wed, Dec 2, 2015 at 1:41 PM, Jacques Nadeau wrote:
> Steven, can you look at this?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Wed, Dec 2, 2015 at 10:05 AM, John S
Could you elaborate a bit on what it is you are trying to do, as well as
what you have tried and what result you saw?
Thanks.
On Tue, Nov 17, 2015 at 8:43 AM, Boris Chmiel <
boris.chm...@yahoo.com.invalid> wrote:
> hello users,
> I'm trying to vectorize a text field of a CSV file and planning to
This looks like DRILL-4006, a fix for which just went in.
https://issues.apache.org/jira/browse/DRILL-4006
On Wed, Nov 4, 2015 at 12:16 PM, John Omernik wrote:
> I am on MapR's 1.2.1 Package.
>
>
>
>
> On Wed, Nov 4, 2015 at 2:14 PM, Abdel Hakim Deneche >
> wrote:
>
> > One last thing, what v
The "table with options" feature is currently in the works, and it will
address this use case.
Currently, the only way to do this is to create two different workspaces,
each with a different default format.
On Sun, Nov 1, 2015 at 1:45 PM, William Witt
wrote:
> Is there a way to override the del
1. You might be able to run a query against OpenTSDB, but I'm not sure if
you will really be able to easily do anything useful right now. Every
column qualifier in an HBase table results in a column in Drill. In the
OpenTSDB format, the column qualifiers are simply time offsets from the
base timest
DRILL-2424 has a comment from Mehant that this should be fixed, but that
there was some sort of merge conflict. Was this ever resolved? Or a new
jira filed?
On Tue, Oct 13, 2015 at 10:13 AM, Rajkumar Singh
wrote:
> There is related jira already filed
> https://issues.apache.org/jira/browse/DRILL
convert_to is the correct function in this case. convert_to converts the
Drill type into some encoding. The output of the convert_to function is
VarBinary. Can you try wrapping it in cast( ... as varchar(255)) and see if that
displays it correctly?
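As a rough analogy in plain Java (this is not Drill code, and the class and
method names are made up for illustration), the difference between the raw
VarBinary output and the value after a cast to varchar is roughly the
difference between a byte array and the string decoded from it:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryVsText {
    // Roughly what an encoding step such as convert_to yields: raw bytes.
    // (Assumption: UTF-8 is the encoding in play.)
    public static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Roughly what casting those bytes to varchar does: decode them as text.
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = encode("drill");
        // Printing the bytes directly shows numeric values, not readable text.
        System.out.println(Arrays.toString(raw));
        // Decoding them gives the readable string back.
        System.out.println(decode(raw));
    }
}
```

If the client shows something that looks like the first line of output, the
cast is probably the missing step.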
On Mon, Oct 12, 2015 at 1:37 PM, John Omernik wrote:
In answer to the other part of your question: yes, by default, each
fragment will write into its own set of files, so you could be looking at (#
unique values) * (number of fragments) files being created. There is an
option to shuffle the data before writing, so that each value will be
written by only
same name.
On Sun, Oct 4, 2015 at 3:34 PM, Stefán Baxter
wrote:
> Hi,
>
> For me the wild card functionality is fine and functions as expected.
> It's partly because of it that I expected an exact match when no operator
> was in play.
>
> Regards,
> -Stefan
>
> O
repeated_contains originally worked as Jason describes: exact matching. At
some point, someone thought that it should allow wildcards and do substring
matching. There was never any real discussion on what this function should
do, though. It would probably be a good idea for someone to come up with a
You need to change s3 to s3n in the URI.
See the discussion in the comments of this blog post:
http://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/
Hopefully that helps. Let me know if you are still having problems.
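In case it is useful, here is a small sketch of the scheme change itself. The
class name, method name, and bucket name are all hypothetical; in practice you
would simply edit the URI wherever it is configured.

```java
import java.net.URI;

public class S3SchemeFix {
    // Returns the same location with the s3 scheme replaced by s3n.
    // Locations that use any other scheme are returned unchanged.
    public static String toS3n(String location) {
        URI uri = URI.create(location);
        if (!"s3".equals(uri.getScheme())) {
            return location;
        }
        return "s3n" + location.substring("s3".length());
    }

    public static void main(String[] args) {
        // "example-bucket" is a made-up bucket name.
        System.out.println(toS3n("s3://example-bucket/logs"));
    }
}
```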
On Tue, Sep 22, 2015 at 8:47 AM, Andries Engelbrecht <
aen
The errors like: "java.io.FileNotFoundException: Path is not a file:
/warehouse2/completed/events/connection_events/1441290600"
are really just noise, and aren't related to the failure. We should
probably clean them up so that we aren't attempting to open directories,
but they are not causing the q
":2},{"array":3}]} |
> +----+
> 1 row selected (0.335 seconds)
>
>
> Is there any trick of reading the list properly in Drill?
>
> Thanks,
> Hao
>
>
>
>
>
> On Fri, Aug 28, 2015 at 4:20 PM, Steven Phillips
Both the Parquet and Drill internal data models are based on protobuf, meaning
there are required, optional, and repeated fields. In this model, repeated
fields cannot be null, nor can they have null elements. The 3-layer nested
structure is necessary to represent a field where the array itself is
nullab
+-+
> > |type| EXPR$1 |
> > ++-+
> > | plan.item.removed | 947 |
> > | plan.item.added| 40342 |
> > ++-+
> > 2 rows selected (0.508 seconds)
> >
> >
> > *2. Same query but involves dimension.type as well*
> >
> > select p.type, coalesce(p.dimensions.dim_type, p.dimensions.type)
> > dimensions_type, count(*) from
> > dfs.tmp.`/analytics/processed//events` as p where
> occurred_at
> > > '2015-07-26' and p.type in ('plan.item.added','plan.item.removed')
> group
> > by p.type, coalesce(p.dimensions.dim_type, p.dimensions.type);
> >
> > Error: SYSTEM ERROR: NumberFormatException: To See
> > Fragment 2:0
> > [Error Id: 4756f549-cc47-43e5-899e-10a11efb60ea on localhost:31010]
> > (state=,code=0)
> >
> >
> > I can provide test data if this is not enough to reproduce this bug.
> >
> > Regards,
> > -Stefán
> >
>
--
Steven Phillips
Software Engineer
mapr.com
> 00-07 Filter(condition=[AND(=($0, 1992-01-01), =($1,
> 1992-01-01))])
> 00-08Project(l_moddate=[$2], l_shipdate=[$1],
> l_modline=[$0])
> 00-09 Scan.
>
> - Rahul
>
--
Steven Phillips
Software Engineer
mapr.com
--
Steven Phillips
Software Engineer
mapr.com
post which describes the option well enough
>
> http://mail-archives.apache.org/mod_mbox/drill-commits/201506.mbox/%3c38571170b14d484bba843f1a513be...@git.apache.org%3E
>
--
Steven Phillips
Software Engineer
mapr.com
:
> Hi All,
>
> Wanted to ask if there is dependency on number of zookeeper instances.
>
> For e.g. if I am running 25 nodes for drill bits - will 1 zookeeper
> suffice.
>
> Also, any performance tips for reading data from S3.
>
> Regards,
> Sarvesh
>
> --
tamp(1432912733))
> > > from `sys`.`version`;
> > >
> > > Error: SYSTEM ERROR: java.lang.IllegalArgumentException: Invalid
> format:
> > > "2015-05-29 15:18:53.000" is malformed at ".000”
> > >
> > > —Andries
> > >
> > >
> > >
> > > On Jun 15, 2015, at 7:18 AM, Christopher Matta
> wrote:
> > >
> > > > Is there a way to convert a timestamp string to unix time?
> > > >
> > > > Chris Matta
> > > > cma...@mapr.com
> > > > 215-701-3146
> > >
> > >
> >
>
--
Steven Phillips
Software Engineer
mapr.com
--
Steven Phillips
Software Engineer
mapr.com
out is also quite a likely
> possibility.
>
> Seeing your code (put it in a gist, don't attach it) would help a lot.
> Seeing queries and query plans would help as well.
>
--
Steven Phillips
Software Engineer
mapr.com
like to ask for
> help
> > how to achieve that? How to make Drill run queries in low-latency (in
> > seconds not minutes)?
> >
> > Any suggestions are welcome!
> >
> > Thanks!
> >
> > George
> >
>
--
Steven Phillips
Software Engineer
mapr.com
it drill-env.sh, located in
> /opt/drill/conf/, and define HADOOP_HOME:"
>
> What is external JAR files? What is the purpose if I set the HADOOP_HOME?
>
> Thanks!
>
> George Lu
>
--
Steven Phillips
Software Engineer
mapr.com
": "xyz"
> > }
> > }
> > ]
> > }
> > When i query
> > select t.company.`modelName` from hdfs.`autom.json` t ;
> > it gives result
> > {"name":"abc"}
> > However, The expected result was both entries.
> > {"name":"abc"}
> > {"name":"xyz"}
> > Even when I query
> > select t.company.`modelName` from hdfs.`autom.json` t where
> > t.company.`modelName`.`name`='xyz' ;
> > it does not find anything.
> >
> >
> >
>
--
Steven Phillips
Software Engineer
mapr.com
> > +++
> > |dir0|col2|
> > +++
> > | folder1 | 2 |
> > | null | null |
> > +++
> >
> > Looks like drill ignored the columns from the first file.
> >
> > - Rahul
> >
>
--
Steven Phillips
Software Engineer
mapr.com
face
> similar to, for example, that implemented in Impala? The scenario I'm
> evaluating involves approximate unique value counting with hyperloglog,
> which would benefit from the ability to perform the counting locally by
> each drillbit followed by a hyperloglog state merge f
`sometable`);
>
> All resulting files are with size 10M(Same as parquet block size).
>
> My question is:
> Is there any way to create a parquet file with multiple parquet blocks?
>
> Thanks,
> Hao
>
--
Steven Phillips
Software Engineer
mapr.com
keys to decrypt the data are custom
> >> controlled. Is there a way I can use drill with this data given that I
> have
> >> a java module that can be called that will provide the master key to
> >> decrypt the data on the fly?
> >> > > My situation: A lot of the use cases that we have might work well
> >> with the new approach of S3 client-side encryption, but for using drill
> to
> >> explore that data. So any pointers/help here will be much appreciated.
> >> > > Thanks!
> >> > > -Ganesh
> >> >
> >>
> >>
> >
> >
>
--
Steven Phillips
Software Engineer
mapr.com
When I query it in hive, it works fine, when I run a
> > count(*) on it drill it works (fast) but when I run a query, it seems
> > to return the same number of results, but it looks like this...
> > thoughts? (These should be strings with emails, domains, etc)
> >
> >
0: jdbc:drill:zk=localhost:2181> select * from tstamp_test limit 1;
> >>>>> ++
> >>>>> | t |
> >>>>> ++
> >>>>> | 2015-01-27T13:43:53.000Z |
> >>>>> ++
> >>>>> 1 row selected (0.119 seconds)
> >>>>>
> >>>>> The below queries, identical apart from the limit clause, behave
> >>>>> differently. The one with the limit clause works, the one without
> >>>> doesn't.
> >>>>> The limit is larger than the total number of rows, so in both cases
> we
> >>>>> should be processing all rows.
> >>>>>
> >>>>> No limit clause. It fails:
> >>>>>
> >>>>> ```
> >>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t,
> >>>>> '-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test)
> as
> >>>> t;
> >>>>> Query failed: RemoteRpcException: Failure while trying to start
> remote
> >>>>> fragment, Expression has syntax error! line 1:30:mismatched input 'T'
> >>>>> expecting CParen [ 7d30d753-0822-4820-afd0-b7e7fe5e639c on
> >>>>> 192.168.99.1:31010 ]
> >>>>> ```
> >>>>>
> >>>>> Limit clause in the subselect (larger than the number of rows in the
> >>>> table)
> >>>>> succeeds.
> >>>>>
> >>>>> ```
> >>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t,
> >>>>> '-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test
> limit
> >>>>> 1) as t;
> >>>>> ...
> >>>>> | 2015-02-17 07:18:00.0 |
> >>>>> ++
> >>>>> 13,015,350 rows selected (105.257 seconds)
> >>>>> ```
> >>>>>
> >>>>> Data can be downloaded here:
> >>>>>
> >>>>> https://s3.amazonaws.com/vgonzalez/data/tstamp_test.tar.gz
> >>>>
> >>>>
> >>
> >
>
>
--
Steven Phillips
Software Engineer
mapr.com
as it reads the CSV in
> such cases? I am trying to use Drill for data exploration purposes and
> mostly to get a peek into the data set from my data lake before running
> bigger queries/analytics on this data set.
>
> Regards,
> Ganesh
>
--
Steven Phillips
Software Engineer
mapr.com
nicely. The minute
> we start the drillbit on that node again, it starts swamping it with work.
>
> I'll shoot through the JSON profiles and some more information on the
> dataset etc. later today (Australian time!).
>
> On Thu, Mar 26, 2015 at 5:31 AM, Steven Phillips
>
's point, the node that the client connects to is not currently
> randomized. Given your description of behavior, I'm not sure that you're
> hitting 2512 or just general undesirable distribution.
>
> On Wed, Mar 25, 2015 at 10:18 AM, Steven Phillips
> wrote:
>
> >
t this essentially swamping this data node with 100% CPU
> usage
> > while leaving the others barely doing any work.
> >
> > As soon as we shut down the Drillbit on this data node, query performance
> > increases significantly.
> >
> > Any thoughts on how I can troubleshoot why Drill is picking that
> particular
> > node?
> >
>
--
Steven Phillips
Software Engineer
mapr.com
know that Drill tries to get data locality, so I'm wondering if this is
> > the cause, but this essentially swamping this data node with 100% CPU
> usage
> > while leaving the others barely doing any work.
> >
> > As soon as we shut down the Drillbit on this data node, query performance
> > increases significantly.
> >
> > Any thoughts on how I can troubleshoot why Drill is picking that
> particular
> > node?
>
>
--
Steven Phillips
Software Engineer
mapr.com
_replace on the text field I get an error for some files.
> (Encountered an illegal char on line 1, column 38: ‘’)
>
> Thanks
> —Andries
--
Steven Phillips
Software Engineer
mapr.com
s decimal(5,2)) , cast(d.map.col2 as double)) from
> `data.json`;
>
>
> I am looking for something on the below lines :
>
> select cast(m1 as map(col1:decimal, col2:double)) from `data.json`;
>
> - Rahul
>
--
Steven Phillips
Software Engineer
mapr.com
--
Steven Phillips
Software Engineer
mapr.com
nM=","4007774":"FUM="} |
> > > +++
> > > 5 rows selected (0.766 seconds)
> > > 0: jdbc:drill:> select convert_from(row_key, 'UTF8') as tid,
> > kvgen(t.price)
> > > as price from dfs.`/tables/trades_flat` t limit 5;
> > > +++
> > > |tid | price|
> > > +++
> > > | AMZN_2013102107 | [{"key":"3901713","value":"JUA="}] |
> > > | AMZN_2013102108 | [{"key":"4600159","value":"E6o="}] |
> > > | AMZN_2013102109 |
> > >
> > >
> >
> [{"key":"3136026","value":"HL4="},{"key":"3448092","value":"JjI="},{"key":"3926121","value":"Hq0="}]
> > > |
> > > | AMZN_2013102111 |
> > >
> > >
> >
> [{"key":"1149689","value":"Iuo="},{"key":"3023456","value":"HRs="}]
> > > |
> > > | AMZN_2013102112 |
> > >
> > >
> >
> [{"key":"0705787","value":"InM="},{"key":"4007774","value":"FUM="}]
> > > |
> > > +++
> > >
> >
>
--
Steven Phillips
Software Engineer
mapr.com
What exactly was the result? I would expect it would implicitly cast the
string to a date for the comparison.
On Thu, Feb 12, 2015 at 2:38 PM, Minnow Noir wrote:
> Yes.
> On Feb 12, 2015 5:36 PM, "Steven Phillips" wrote:
>
> > did you try the form:
> > wher
te(, ) seems to work, although it's verbose.
> >
> > Do I have this right, and is there a less verbose way to handle this?
> >
> > Thanks
> >
>
--
Steven Phillips
Software Engineer
mapr.com
> >>> 6 rows selected (0.163 seconds)
> >>>
> >>> Listing:3
> >>>
> >>> 0: jdbc:drill:> SELECT t.a,t.b,t.c.x, t.c.y, flatten(t.c.z) from
> dfs.`/data/nested/clicks/sthota_test_1.json` as t;
> >>> ++++++
> >>> | a | b | EXPR$2 | EXPR$3 | EXPR$4 |
> >>> ++++++
> >>> | r1cl1 | r1c2 | 1 | a string | 1 |
> >>> | r2cl1 | r2c2 | 2 | a string | 1 |
> >>> | r2cl1 | r2c2 | 2 | a string | 2 |
> >>> | r3cl1 | r3c2 | 3 | a string | 1 |
> >>> | r3cl1 | r3c2 | 3 | a string | 2 |
> >>> | r3cl1 | r3c2 | 3 | a string | 3 |
> >>> | r4cl1 | r4c2 | 4 | a string | 1 |
> >>> | r4cl1 | r4c2 | 4 | a string | 2 |
> >>> | r4cl1 | r4c2 | 4 | a string | 3 |
> >>> | r4cl1 | r4c2 | 4 | a string | 4 |
> >>> | r5cl1 | r5c2 | 5 | a string | 1 |
> >>> | r5cl1 | r5c2 | 5 | a string | 2 |
> >>> | r5cl1 | r5c2 | 5 | a string | 3 |
> >>> | r5cl1 | r5c2 | 5 | a string | 4 |
> >>> | r5cl1 | r5c2 | 5 | a string | 5 |
> >>> ++++++
> >>> 15 rows selected (0.171 seconds)
> >>>
> >>> Thanks
> >>> Sudhakar Thota
> >
>
--
Steven Phillips
Software Engineer
mapr.com
d be the maximum advisable size
> for a single JSON file? As at some point there will be tradeoff with
> reduced # of files vs maximum size of a single file.
> >
> > Something to consider when using Flume or another tool as data source
> for eventual Drill consumption.
> >
> > —Andries
> >
> >
>
>
--
Steven Phillips
Software Engineer
mapr.com
nBytes method,
> if avgRowSizeInBytes is too large, the return value will be out of int
> range. So the code should be fixed like "return
> ((long)avgRowSizeInBytes)*1024L*1024L".
> Thanks&Regards
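To illustrate the overflow described above, here is a minimal sketch. The
class and method names are invented, and the "buggy" variant is only a guess
at the general shape of the original code (multiplying in int arithmetic
before widening to long), not a copy of it. The "fixed" variant uses the
expression proposed in the message.

```java
public class RowSizeOverflow {
    // Assumed shape of the problem: the multiplication happens in 32-bit int
    // arithmetic, so the product wraps around before it is widened to long.
    public static long buggy(int avgRowSize) {
        return avgRowSize * 1024 * 1024;
    }

    // The proposed fix: widen to long first, so the product is computed in
    // 64-bit arithmetic.
    public static long fixed(int avgRowSize) {
        return ((long) avgRowSize) * 1024L * 1024L;
    }

    public static void main(String[] args) {
        int avgRowSize = 2048; // 2048 * 1024 * 1024 = 2^31, one past Integer.MAX_VALUE
        System.out.println(buggy(avgRowSize)); // prints -2147483648
        System.out.println(fixed(avgRowSize)); // prints 2147483648
    }
}
```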
--
Steven Phillips
Software Engineer
mapr.com
nswer-23195243
> >
> > If others agree, I think it would be appropriate for Hao to file a JIRA
> to
> > make sure we follow and check this convention.
> >
> > J
> >
> > On Wed, Jan 21, 2015 at 5:07 PM, Steven Phillips >
> > wrote:
>
>>>>>>
> > >>>>>>>
> > >>>>>>> Error: exception while executing query: Failure while executing
> > >> query.
> > >>>>>>> (state=,code=0)
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> {
> > >>>>>>> "entities": {
> > >>>>>>> "trends": [],
> > >>>>>>> "symbols": [],
> > >>>>>>> "urls": [],
> > >>>>>>> "hashtags": [],
> > >>>>>>> "user_mentions": []
> > >>>>>>> },
> > >>>>>>> "entities": {
> > >>>>>>> "trends": [1,2,3],
> > >>>>>>> "symbols": [4,5,6],
> > >>>>>>> "urls": [7,8,9],
> > >>>>>>> "hashtags": [
> > >>>>>>> {
> > >>>>>>> "text": "GoPatriots",
> > >>>>>>> "indices": []
> > >>>>>>> }
> > >>>>>>> ],
> > >>>>>>> "user_mentions": []
> > >>>>>>> }
> > >>>>>>> }
> > >>>>>>>
> > >>>>>>> The issue seems to be that if some records have arrays with maps
> in
> > >> them
> > >>>>>>> and others are empty.
> > >>>>>>>
> > >>>>>>> —Andries
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Jan 21, 2015, at 2:34 PM, Hao Zhu wrote:
> > >>>>>>>
> > >>>>>>>> Seems it works for below json file:
> > >>>>>>>> {
> > >>>>>>>> "entities": {
> > >>>>>>>> "trends": [],
> > >>>>>>>> "symbols": [],
> > >>>>>>>> "urls": [],
> > >>>>>>>> "hashtags": [
> > >>>>>>>> {
> > >>>>>>>> "text": "GoPatriots",
> > >>>>>>>> "indices": [
> > >>>>>>>> 83,
> > >>>>>>>> 94
> > >>>>>>>> ]
> > >>>>>>>> }
> > >>>>>>>> ],
> > >>>>>>>> "user_mentions": []
> > >>>>>>>> },
> > >>>>>>>> "entities": {
> > >>>>>>>> "trends": [1,2,3],
> > >>>>>>>> "symbols": [4,5,6],
> > >>>>>>>> "urls": [7,8,9],
> > >>>>>>>> "hashtags": [
> > >>>>>>>> {
> > >>>>>>>> "text": "GoPatriots",
> > >>>>>>>> "indices": []
> > >>>>>>>> }
> > >>>>>>>> ],
> > >>>>>>>> "user_mentions": []
> > >>>>>>>> }
> > >>>>>>>> }
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json` as
> t
> > >> where
> > >>>>>>>> t.entities.urls is not null;
> > >>>>>>>> ++
> > >>>>>>>> | EXPR$0 |
> > >>>>>>>> ++
> > >>>>>>>> | [7,8,9]|
> > >>>>>>>> ++
> > >>>>>>>> 1 row selected (0.139 seconds)
> > >>>>>>>> 0: jdbc:drill:> select t.entities.urls from dfs.tmp.`a.json` as
> t
> > >> where
> > >>>>>>>> t.entities.urls is null;
> > >>>>>>>> ++
> > >>>>>>>> | EXPR$0 |
> > >>>>>>>> ++
> > >>>>>>>> ++
> > >>>>>>>> No rows selected (0.158 seconds)
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Hao
> > >>>>>>>>
> > >>>>>>>> On Wed, Jan 21, 2015 at 2:01 PM, Aditya <
> adityakish...@gmail.com>
> > >>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> I believe that this works if the array contains homogeneous
> > >> primitive
> > >>>>>>>>> types. In your example, it appears from the error, the array
> > field
> > >>>>>>> 'member'
> > >>>>>>>>> contained maps for at least one record.
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jan 21, 2015 at 1:57 PM, Christopher Matta <
> > >> cma...@mapr.com>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Trying that locally did not work for me (drill 0.7.0):
> > >>>>>>>>>>
> > >>>>>>>>>> 0: jdbc:drill:zk=local> select `id`, `name`, `members` from
> > >>>>>>>>> `Downloads/test.json` where repeated_count(`members`) > 0;
> > >>>>>>>>>> Query failed: Query stopped., Failure while trying to
> > materialize
> > >>>>>>>>> incoming schema. Errors:
> > >>>>>>>>>>
> > >>>>>>>>>> Error in expression at index -1. Error: Missing function
> > >>>>>>>>> implementation: [repeated_count(MAP-REPEATED)]. Full
> expression:
> > >>>>>>> --UNKNOWN
> > >>>>>>>>> EXPRESSION--.. [ 47142fa4-7e6a-48cb-be6a-676e885ede11 on
> > >>>>>>> bullseye-3:31010 ]
> > >>>>>>>>>>
> > >>>>>>>>>> Error: exception while executing query: Failure while
> executing
> > >>>>> query.
> > >>>>>>>>> (state=,code=0)
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Chris Matta
> > >>>>>>>>>> cma...@mapr.com
> > >>>>>>>>>> 215-701-3146
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Jan 21, 2015 at 4:50 PM, Aditya <
> > adityakish...@gmail.com
> > >>>
> > >>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> repeated_count('entities.urls') > 0
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jan 21, 2015 at 1:46 PM, Andries Engelbrecht <
> > >>>>>>>>>>> aengelbre...@maprtech.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> How do you filter out records with an empty array in drill?
> > >>>>>>>>>>>> i.e some records have "url":[] and some will have an array
> > >> with
> > >>>>> data
> > >>>>>>>>> in
> > >>>>>>>>>>>> it. When trying to read records with data in the array drill
> > >> fails
> > >>>>>>> due
> > >>>>>>>>>>> to
> > >>>>>>>>>>>> records missing any data in the array. Trying a filter
> with/*
> > >> where
> > >>>>>>>>>>>> "url":[0] is not null */ fails, also fails if applying url
> is
> > >> not
> > >>>>>>>>> null.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Note some of the arrays contains maps, using twitter data as
> > an
> > >>>>>>>>> example
> > >>>>>>>>>>>> below. Some records have an empty array with “hashtags”:[]
> > and
> > >>>>>>> others
> > >>>>>>>>>>> will
> > >>>>>>>>>>>> look similar to what is listed below.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> "entities": {
> > >>>>>>>>>>>> "trends": [],
> > >>>>>>>>>>>> "symbols": [],
> > >>>>>>>>>>>> "urls": [],
> > >>>>>>>>>>>> "hashtags": [
> > >>>>>>>>>>>> {
> > >>>>>>>>>>>> "text": "GoPatriots",
> > >>>>>>>>>>>> "indices": [
> > >>>>>>>>>>>> 83,
> > >>>>>>>>>>>> 94
> > >>>>>>>>>>>> ]
> > >>>>>>>>>>>> }
> > >>>>>>>>>>>> ],
> > >>>>>>>>>>>> "user_mentions": []
> > >>>>>>>>>>>> },
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks
> > >>>>>>>>>>>> —Andries
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>>
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> >
> >
>
--
Steven Phillips
Software Engineer
mapr.com
e drill-bit
> nodes
> > > > participated in the query execution?
> > > >
> > > >
> > > > ---
> > > > Mufeed Usman
> > > > My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> |
> My
> > > > So
node MapR cluster.
>
>
> ---
> Mufeed Usman
> My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> | My
> Social Cause <http://www.vision2016.org.in/> | My Blogs : LiveJournal
> <http://mufeed.livejournal.com>
>
--
Steven Phillips
Software Engineer
mapr.com
>>>
> >>> to change the storage format.
> >>>
> >>>
> >>>
> >>> On Tue, Jan 6, 2015 at 12:35 PM, Sungwook Yoon
> >> wrote:
> >>>
> >>>> Hi
> >>>>
> >>>> I am trying to save the query as csv
> >>>>
> >>>> So, I am doing
> >>>>
> >>>> create table as dfs.tmp.`/tmp.csv` select ..
> >>>>
> >>>> It creates a parquet file.
> >>>> Why did it not create csv file?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Sungwook
> >>>>
> >>>
> >>
>
>
--
Steven Phillips
Software Engineer
mapr.com
>
> Thanks!
> Paul
>
--
Steven Phillips
Software Engineer
mapr.com
q-core-0.9-drill-r8.jar:na]
> >>> > > > at
> >>> > > >
> >>> > >
> >>> >
> >>>
> org.eigenbase.relopt.volcano.RelSubset.propagateCostImprovements(RelSubset.java:314)
> >>> > > > ~[optiq-
zip command before get
> >> data?
> >> Or can you add plugin for zipped files?
> >> ---
> >> Best regards,
> >> Dima Pl
> >
> >
> > --
> > *Jim Scott*
> > Director, Enterprise Strategy & Architecture
> > +1 (347) 746-9281
> >
> > <http://www.mapr.com/>
> > [image: MapR Technologies] <http://www.mapr.com>
>
--
Steven Phillips
Software Engineer
mapr.com
le executing query: Failure while executing query.
>> (state=,code=0)
>>
>>
>> What's going on?
>> Drillbit reads the column family information correctly, just does not go
>> to
>> columns.
>> Thanks,
>>
>> Sungwook
>>
>>
>
--
Steven Phillips
Software Engineer
mapr.com
t 12:50 PM, Aditya > wrote:
>
> > On Thu, Dec 11, 2014 at 12:47 PM, Sungwook Yoon >
> > wrote:
> >
> > > The sqlline is executed under root too.
> >
> >
> > What about DrillBits? Or are you running Drill in embedded mode?
> >
>
--
Steven Phillips
Software Engineer
mapr.com
>
> Sungwook
>
>
> On Thu, Dec 11, 2014 at 12:43 PM, Steven Phillips >
> wrote:
>
> > If it's returning 0 records, but no exception is thrown, my first guess
> > would be that there is a permission issue.
> >
> > On Thursday, December 11, 2014
urns correct column family names
> >
> > But, select * from excel;
> > does not work.
> >
> > What should I look at?
> >
> > Thanks,
> >
> > Sungwook
> >
>
--
Steven Phillips
Software Engineer
mapr.com
r the directory '/tmp'. But Drill seems
> > unable to discover it automatically. We must recreate the view based on the
> > contents of view file and register the views.
> >
> > Is there some simpler way to force drill to register all of them?
> >
> > Thanks.
> >
>
--
Steven Phillips
Software Engineer
mapr.com
>> >> net.sourceforge.squirrel_sql.client.Version.(Version.java:34)
> >> >> >>at net.sourceforge.squirrel_sql.client.Main.main(Main.java:60)
> >> >> >>
> >> >> >> If I replace the log4j.jar with the one from drill, I get
> >> >> >> Exception in thread "main" java.lang.IncompatibleClassChangeError:
> >> >> >> Implementing class
> >> >> >>at java.lang.ClassLoader.defineClass1(Native Method)
> >> >> >>at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> >> >> >>at
> >> >>
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> >> >> >>at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> >> >> >>at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> >> >> >>at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> >> >> >>at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >> >> >>at java.security.AccessController.doPrivileged(Native Method)
> >> >> >>at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >> >> >>at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> >> >> >>at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >> >> >>at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> >> >> >>at
> >> >>
> >>
> net.sourceforge.squirrel_sql.client.SquirrelLoggerFactory.(SquirrelLoggerFactory.java:47)
> >> >> >>at
> net.sourceforge.squirrel_sql.client.Main.startApp(Main.java:80)
> >> >> >>at net.sourceforge.squirrel_sql.client.Main.main(Main.java:73)
> >> >> >>
> >> >> >>
> >> >> >> Any suggestions?
> >> >> >>
> >> >> >> Thanks in advance.
> >> >> >
> >> >>
> >>
>
--
Steven Phillips
Software Engineer
mapr.com
? The only reason I
can think of for doing this is if you are adding UDFs, or have implemented
your own storage plugin.
On Mon, Dec 1, 2014 at 1:21 PM, Steven Phillips
wrote:
> Yes, drill-env.sh would be the place to put this, regardless of how Drill
> is deployed.
>
> On Mon, Dec 1,
it is properly available to the service ?
>
> — David
>
>
--
Steven Phillips
Software Engineer
mapr.com