Hello,
One question:
How can I verify whether predicate pushdown is happening?
I have one Parquet file generated using a CTAS command. I have executed
REFRESH METADATA. I am firing a simple query with a WHERE clause. In the
physical plan for the scan operation, I see rowcount as total number of
tuning. Which also means that you
> have to revisit the kinds of analytics you would like your end users to
> have. Which again raises the question: what kinds of analytics truly
> generate value for the BI user?
>
> Best,
> Saurabh
>
> On Wed, Oct 18, 2017 at 10:26 PM, PROJJWAL
Hi,
Are there any public performance benchmarks that users have achieved using
Drill in production scenarios? It would be useful if someone could share
links to customer stories.
Regards
are the link .
>
> Did Parth's suggestion of
> store.parquet.reader.pagereader.bufferedread=false
> resolve the issue?
>
> Also share the details of the hardware setup... #nodes, Hadoop version,
> etc.
>
>
> -----Original Message-
> From: PROJJWAL SAHA [mailto:pr
minimal data file that triggers this?
>
> You can also try turning off the buffering reader.
> store.parquet.reader.pagereader.bufferedread=false
>
> With async reader on and buffering off, you might not see any degradation
> in performance in most cases.
>
>
>
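The option named above can be set per session. A minimal sketch, assuming a Drill release that exposes `store.parquet.reader.pagereader.bufferedread` (the option name is taken from this thread; the `sys.options` check is an added suggestion, not part of the original reply):

```sql
-- Turn off the buffering reader for the current session only.
ALTER SESSION SET `store.parquet.reader.pagereader.bufferedread` = false;

-- Confirm which page reader options are in effect.
SELECT * FROM sys.options
WHERE name LIKE 'store.parquet.reader.pagereader%';
```

Setting the option at session scope keeps the change local to the connection used for troubleshooting.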
> On T
ava.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.
>
> Can you try disabling the async Parquet reader to see if the problem gets resolved?
>
>
> alter session set `store.parquet.reader.pagereader.async`=false;
>
> Thanks,
>
> Arjun
>
>
>
> From: PROJJWAL SAHA <proj.s...@gmail.com>
>
I get the exception below when querying Parquet data on Oracle Storage Cloud
service.
Any pointers on what this points to?
Regards,
Projjwal
ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
part-6-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error was : null
y you used in the query, i.e. `data25Goct6/websales`?
>
> Thanks
> Padma
>
>
> On Oct 9, 2017, at 5:50 AM, PROJJWAL SAHA <proj.s...@gmail.com> wrote:
>
> Hello all,
>
> I am getting the below exception when querying parqu
Hello all,
I am getting the exception below when querying Parquet data stored in
a storage cloud service. What does this exception point to?
The query on the same Parquet files works when they are stored in
Alluxio, which means the data is fine.
I am using Drill 11.1.
Any help is appreciated!
and put it in the classpath of drill to enable me to debug the code
at runtime.
Please help me here.
Regards,
Projjwal
On Sun, Mar 19, 2017 at 10:43 PM, PROJJWAL SAHA <proj.s...@gmail.com> wrote:
> Hi all,
>
> I am trying to debug a 3rd party storage plugin and I need to enable
Hi all,
I am trying to debug a 3rd party storage plugin and I need to enable debugging
with my Eclipse IDE. Can someone please guide me through the steps to enable
debugging in Eclipse? Any documentation or link would also help. Also, are
the steps the same if I want to debug the Drill codebase?
Regards,
far...@mapr.com> wrote:
> Three million rows is too many for sqlline to print.
>
> Try doing a COUNT(*) and see if that query returns the correct count on
> that table.
>
>
> Thanks,
>
> Khurram
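The check suggested above might look like the following sketch. `xxx` is the placeholder table name used in the original question, and the `LIMIT` value is an arbitrary choice for illustration:

```sql
-- Verify the total row count without printing every row.
SELECT COUNT(*) FROM xxx;

-- Inspect a small sample instead of all three million rows.
SELECT * FROM xxx LIMIT 20;
```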
>
> ____
> From: PROJJWAL SAHA <pro
All,
I am using drillconf from the command line to display the result of a query like
select * from xxx
that returns 3 million rows. The screen scrolls quickly as it displays the
result; however, it stops after some time with this exception:
java.lang.NegativeArraySizeException
at
All,
One question:
I am querying .gz.parquet files.
select * from xxx returns data like
+-+
| current |
+-+
|
t thing I would try is to make your cluster a single node
> cluster first and then run the same explain plan query separately on each
> individual file.
>
>
>
> On Mar 7, 2017 5:09 AM, "PROJJWAL SAHA" <proj.s...@gmail.com> wrote:
>
> > Hi Rahul,
> >
dfs storage plugin.
Query planning time is approx. 30 secs.
Query execution time is approx. 1.5 secs.
Regards,
Projjwal
-- Forwarded message --
From: PROJJWAL SAHA <proj.s...@gmail.com>
Date: Fri, Mar 3, 2017 at 5:06 PM
Subject: Minimise query plan time for dfs plugin for loca
tributed cluster is having some effect on the planning...
>
> On Fri, Mar 3, 2017 at 6:08 AM, PROJJWAL SAHA <proj.s...@gmail.com> wrote:
>
> > I did not change the default values used by drill.
> > Are you talking of changing planner.memory_limit
> > and planner.memory.m
wrote:
> how much memory have you set for planner ?
>
> On Fri, Mar 3, 2017 at 5:06 PM, PROJJWAL SAHA <proj.s...@gmail.com> wrote:
>
> > Hello all,
> >
> > I am querying select * from dfs.xxx where yyy (filter condition)
> >
> > I am using dfs storage pl
Hello,
I am running a select * query on a 1 GB CSV file with a 5-node Drill
cluster. The CSV file is stored in another storage cluster within the
enterprise.
In the query profile, I see one major fragment and, within the major
fragment, only 1 minor fragment. The hostname for the minor
> to move them into a single region.
>
> In either case, from the AWS console you can figure out how much network
> throughput you are getting, if that is the bottleneck.
> Also, Drill machines need CPU, so along with 32GB memory, having 8
> cores would be desirable.
>
>
l server is
>
> On Mon, Feb 20, 2017 at 5:37 PM, PROJJWAL SAHA <proj.s...@gmail.com>
> wrote:
>
> > Hello all,
> >
> > I am using 1GB data in the form of .tsv file, stored in Amazon S3 using
> > Drill 1.8. I am using default configurations of Drill using S3
Hello all,
I have 1 GB of data in a .tsv file stored in Amazon S3, and I am querying it
with Drill 1.8. I am using Drill's default configuration with the S3 storage
plugin that comes out of the box. The drillbits are configured on a 5-node
cluster with 32 GB RAM and 4 vCPUs.
I see that select * from xxx; query