Instead of defining a hard-coded set of prefixes, suffixes, and/or
patterns, can we give users some kind of configuration parameter
somewhere?
Perhaps the file-system plug-in should have a configuration parameter
that is a list of "glob" or regular-expression patterns specifying
names to ignore.
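A hypothetical shape for that parameter, laid out in the same JSON style as Drill's existing format-plugin configuration (the `ignorePatterns` key is an invention for illustration only; no such option exists today):

```json
{
  "type": "file",
  "enabled": true,
  "ignorePatterns": [
    ".*",
    "_*",
    "*.tmp"
  ]
}
```

Whether the patterns should be globs (as sketched here) or regular expressions would be part of the design discussion.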
Is there a reason why 'week' isn't supported? If not, what's the most
sensible way to group timestamped data by week?
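Absent a built-in 'week' unit, one sensible grouping strategy, sketched here in Python rather than Drill SQL purely for illustration, is to bucket each timestamp by its ISO (year, week) pair:

```python
from datetime import datetime
from collections import defaultdict

def group_by_iso_week(timestamps):
    """Bucket datetimes by their (ISO year, ISO week number) pair."""
    buckets = defaultdict(list)
    for ts in timestamps:
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(iso[0], iso[1])].append(ts)
    return dict(buckets)

data = [datetime(2015, 10, 15), datetime(2015, 10, 16), datetime(2015, 10, 26)]
grouped = group_by_iso_week(data)
# Oct 15 and Oct 16, 2015 fall in the same ISO week; Oct 26 falls in a later one.
```

Using the ISO year (rather than the calendar year) keeps timestamps near New Year from being split incorrectly across week boundaries.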
Well, there's this minimal info with a pointer to the doc below:
http://drill.apache.org/docs/plugin-configuration-basics/#storage-plugin-rest-api
This link was mentioned on the user list a few days ago and the
Apache Drill docs now include a pointer to it:
https://docs.google.com/document/d/1mRs
Here is a Google doc:
https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
It may be time for us to add this to the Drill docs.
Bob
On Thu, Oct 15, 2015 at 4:03 PM, Christopher Matta wrote:
> What about general REST API reference documentation? What endpoints ar
What about general REST API reference documentation? What endpoints are
available?
On Thursday, October 15, 2015, Kristine Hahn wrote:
> I haven't seen any info about REST and impersonation, but the REST API
> documentation pertaining to authentication for the Web Console is on Web
> Console and
I don't think any such thing is required. I'm not sure why you still see
the issue. After you updated the storage plug-in, can you confirm whether
the changes took effect? Also, what version of Drill are you using?
On Thu, Oct 15, 2015 at 12:23 PM, John Omernik wrote:
> No I added, the bin extensi
I haven't seen any info about REST and impersonation, but the REST API
documentation pertaining to authentication for the Web Console is on Web
Console and REST API Privileges
on
http://doc.mapr.com/display/MapR/Configuring+Web+Console+and+REST+API+Security
and not in the Apache Drill docs because
I can't seem to find any reference documentation around the REST API on
https://drill.apache.org/docs/, am I missing where it is?
I'd like to know how you can pass a user along with the query when
impersonation is enabled, also how would you authenticate when
authentication is enabled?
Chris Matt
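For reference while the docs question is sorted out: Drill's REST interface accepts queries as a POST to `/query.json`. A minimal sketch of composing such a request (host, port, and query here are placeholders; how to pass a user under impersonation is exactly the open question above, so nothing about that is shown):

```python
import json

def build_query_request(host, sql, port=8047):
    """Compose the URL, headers, and JSON body for Drill's REST query endpoint."""
    url = "http://{}:{}/query.json".format(host, port)
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"queryType": "SQL", "query": sql})
    return url, headers, body

url, headers, body = build_query_request(
    "localhost", "SELECT * FROM cp.`employee.json` LIMIT 1")
```

The request could then be sent with any HTTP client; authentication behavior should be verified against the REST API security doc linked later in the thread.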
You can download the binary release of 1.2 here:
http://people.apache.org/~adeneche/apache-drill-1.2.0-rc3/
On Thu, Oct 15, 2015 at 10:37 AM, Ted Dunning wrote:
> MapR has pushed an independent advanced release of Drill 1.2-ish that
> likely has this fix.
>
>
>
> On Thu, Oct 15, 2015 at 7:56 AM
No, I added the bin extension, updated the storage plugin, then tried the
query... do I need to log back in to sqlline for things to take effect?
On Thu, Oct 15, 2015 at 1:50 PM, Abhishek Girish
wrote:
> You'll get a "file not found" error if Drill cannot recognize an extension
> (**). So if you tri
You'll get a "file not found" error if Drill cannot recognize an extension
(**). So if you tried querying a file with say .bin extension before you
added "bin" as an extension to the json format plugin (and did not specify
the default input format), you'd hit that issue.
Can you try once more, aft
That's on me: I thought I had typed good JSON, but apparently I had not. I
got an 'invalid JSON format' error and assumed that specifying extensions
there was not valid.
That said, when I try to select a file or a directory, I get "file not
found" with the .bin extension, yet I know it to be there.
I have had problems with hyphens in S3 bucket names in the past when used
with Drill. Try removing the hyphen.
Scott
> On Oct 15, 2015, at 11:03 AM, Jason Jho wrote:
>
> I've enabled JetS3t and configured a storage plugin for S3, but I'm having
> trouble connecting to an S3 bucket URI that the pa
I'm attaching the patch that I put together for decimal.
This includes:
* Decimal schema translation from Avro to Parquet
- Need to add date, time, timestamp
- Need to add Parquet to Avro support
* Read-side support for any Avro logical type
* Special write-side support for decimal
- This w
MapR has pushed an independent advanced release of Drill 1.2-ish that
likely has this fix.
On Thu, Oct 15, 2015 at 7:56 AM, Kamesh wrote:
> Hi James,
> Which version of Drill are you using? Also, do any of the documents
> contain a field of type timestamp or date?
> If so, recently
Hi Ryan
Thanks for this - it sounds just what we need.
How do we go about trialing the local copies with our code?
It would be good to check this all out now, if 1.8.0 is delayed for a while.
Contact me via https://drillers.slack.com/messages/dev/team/cmathews/ to discuss.
Cheers — Chri
thanks Ryan!
(cc parquet dev list as well)
On Thu, Oct 15, 2015 at 9:46 AM, Ryan Blue wrote:
> Hi Chris,
>
> Avro does have support for dates, but it hasn't been released yet because
> 1.8.0 was blocked on license issues (AVRO-1722). I have a branch with
> preliminary parquet-avro support for De
Hi Chris,
You could probably contribute some sort of type annotation to parquet-avro
so that it produces the data type in the Parquet schema.
This class generates a Parquet schema from the Avro schema:
https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/av
Hey all -
I am trying to demonstrate a neat use case. Using audit logs in MapR, I'd
like to be able to point Drill at the directory, and just go, no loading of
data, just go.
The problem I am having is how to describe the path. First how logs are
stored.
From the base of MapRFS
/var/mapr/loca
I've enabled JetS3t and configured a storage plugin for S3, but I'm having
trouble connecting to an S3 bucket URI that the parser doesn't seem to
like. I know it's a valid S3 bucket since it doesn't violate any of the
naming conventions and can access the data via s3cmd.
The error I am seeing is:
I hit that same issue many times, but the problem resolved after I
upgraded to a pre-release of Drill 1.2.
On Thu, Oct 15, 2015 at 9:21 AM, Jacques Nadeau wrote:
> I believe that this error is due to an incompatibility between Mongo's
> Extended JSON support and Drill's extended JSON support that
Thank you Jacques - yes this is exactly the issue I am having.
We are currently using Avro to define schemas for our Parquet files, and as you
correctly point out there is no way of defining date types in Avro. Due to the
volumes of data we are dealing with, using CTAS is not an option for us a
That should have worked! I also tried it out just now:
*Data:*
# cat abc.bin
{"abc":"123", "pqr":"789"}
*Format Plugin:*
"json": {
"type": "json",
"extensions": [
"json",
"bin"
]
}
*Query:*
> select * from dfs.tmp.`abc.bin`;
+--+--+
| abc | pqr |
Hi Jacques,
You describe my small dilemma much better than me, thank you.
Regards,
-Stefan
On Thu, Oct 15, 2015 at 3:19 PM, Jacques Nadeau wrote:
> A little clarification here:
>
> Parquet has native support for date types. Drill does too. However, since
> Avro does not, there is no way that
I believe that this error is due to an incompatibility between Mongo's
Extended JSON support and Drill's extended JSON support that is fixed in
1.2 (release imminent e.g. next 24-48 hours). If you want to try out the
fix immediately, you would need to grab the master branch of Drill and
build it yo
A little clarification here:
Parquet has native support for date types. Drill does too. However, since
Avro does not, there is no way that I know of to write a Parquet file via
the Avro adapter that will not require a cast. If you did a CTAS in Drill
and cast the data types correctly in the CTAS,
Hi James,
Which version of Drill are you using? Also, do any of the documents
contain a field of type timestamp or date?
If so: we have recently fixed issues in handling the timestamp/date and
binary data types.
These fixes are targeted for the *1.2.0* release, which will happen very soon.
On T
Hi,
I am trying to query a collection in mongo directly, using this query:
select * from mongo.omegatestbed.testshard4 where ENTITY_ID = 1216515 limit 1;
ENTITY_ID is indexed, ascending. The columns in the collection may have
differing data types for the same column name. I get this error:
E
Hi Chris,
I understand now, thank you.
What threw me off was that, in our standard use-case, we are not using cast
for our TIMESTAMP_MILLIS fields and I thought we were getting them directly
formatted from Parquet, but I had overlooked our UDF that handles the
casting... sorry :).
Thank you
Hi Stefan
I am not sure I fully understand your question 'why you don't seem to be
storing your dates in Parquet Date files.'
As far as I am aware, all date types in Parquet (i.e. DATE, TIME_MILLIS,
TIMESTAMP_MILLIS) are stored as annotated int32 or int64 types.
The only other opti
Thank you Chris, this clarifies a whole lot :).
I wanted to try to avoid the cast in the CTAS on the way from Avro to
Parquet (not possible) and then avoid casting as much as possible when
selecting from the Parquet files.
What is still unclear to me is why you don't seem to be storing your dates
Hey all,
I have some json files that are written out with a .bin extension.
(Process not under my control). In drill I am able to create a workspace
that uses a default input type of json, and this is able to read with no
issues, but I'd like to be able specify that .bin should also be read as
Hello Stefan
We use Avro to define our schemas for Parquet files, and we find that
encoding dates as longs in epoch milliseconds works. We then CAST the long
to a TIMESTAMP on the way out during the SELECT statement
(or by using a VIEW).
example java snippet:
// v
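The Java snippet above is truncated in the archive. As an illustration of the same long-milliseconds convention (not the author's code, and in Python rather than Java):

```python
from datetime import datetime, timezone

def to_epoch_millis(dt):
    """Encode a datetime as epoch milliseconds, as stored in the Avro long field."""
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)

def from_epoch_millis(millis):
    """Decode epoch milliseconds back to a UTC datetime
    (roughly what CAST ... AS TIMESTAMP does on the way out)."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

millis = to_epoch_millis(datetime(2015, 10, 15, 16, 3, 0))
restored = from_epoch_millis(millis)  # round-trips to the original instant
```

The key point is that the Parquet file only ever sees a plain long; the interpretation as a timestamp happens at read time, via the CAST or a VIEW.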