Hi John,
I have not played with the stride length. Based on my understanding of the
code, the stride length determines the number of rows between index
entries, so if you decrease the stride length, you get finer-grained
indexes, which can potentially help you skip more unnecessary ro
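
To make the stride tradeoff concrete, here is a minimal sketch in Python. It is a toy model and not ORC's actual reader: it assumes one min/max index entry per stride of rows, and the helper names (index_entries, rows_read), the column data, and the stride values are all hypothetical, chosen only for illustration.

```python
import math

def index_entries(rows_in_stripe: int, stride: int) -> int:
    """Number of index entries (row groups) for one column in one stripe."""
    return math.ceil(rows_in_stripe / stride)

def rows_read(values, stride, lo, hi):
    """Count rows in row groups whose [min, max] range overlaps [lo, hi].

    Row groups whose range does not overlap the filter are skipped.
    """
    total = 0
    for start in range(0, len(values), stride):
        group = values[start:start + stride]
        if min(group) <= hi and max(group) >= lo:
            total += len(group)
    return total

# Hypothetical column: 100,000 values stored in sorted order.
values = list(range(100_000))

# Compare a coarse and a fine stride for the filter 4200 <= x <= 4300.
for stride in (10_000, 1_000):
    entries = index_entries(len(values), stride)
    touched = rows_read(values, stride, 4_200, 4_300)
    print(f"stride={stride}: {entries} index entries, {touched} rows read")
```

In this toy model the smaller stride reads fewer rows for the same filter but stores ten times as many index entries, which is the tradeoff described above. The benefit depends on the data being clustered on the filtered column; with randomly ordered values, most row groups would overlap the filter at either stride.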
Yin -
Fantastic! That is exactly the type of explanation of settings I'd like to
see: not just what a setting does, but the tradeoffs and how things are
applied in the real world. Have you played with the stride length at all?
On Wed, Nov 13, 2013 at 1:13 PM, Yin Huai wrote:
> Hi John,
>
> He
Hi John,
Here is my experience with the stripe size. For a given table, when the
stripe size is increased, the size of a column in a stripe increases, which
means the ORC reader can read a column from disk more efficiently
because the reader can sequentially read more data (assuming the read
If you get some useful advice, let's improve the doc.
-- Lefty
On Tue, Nov 12, 2013 at 6:15 PM, John Omernik wrote:
> I am looking for guidance (read examples) on tuning ORC settings for my
> data. I see the documentation that shows the defaults, as well as a brief
> description of what it is
I am looking for guidance (read examples) on tuning ORC settings for my
data. I see the documentation that shows the defaults, as well as a brief
description of what it is. What I am looking for is some examples of
things to try. *Note: I understand that nobody wants to make sweeping
declaring o