It's just a flat select (via a view) . basically select field1, field2,....
field100 from view_mytable where dir0 = '2016-05-01' there is no
aggregation or anything happening.

As to the dictionary encoding and the new reader some thoughts:

1. Based on what I've read, the new reader is faster for flat data, in my
case, it's the only thing that is allowing me to read the data created in a
CDH cluster with a map reduce job. The "old" reader gives me the array
index out of bounds (see other thread).  So in order to clean up my data,
I'd like to use the new reader here, however, now you have me worried about
incorrect data.

2. The files are already dictionary encoded, when I do the CTAS without the
encoding, the result is the files are quite a bit bigger than the original
files. Not a huge issue, but substantial (10-20 GB per day).   Thats why I
tried to combine the two.

3.  I am now worried about both the encoding/reader for incorrect data...
Are there any JIRA's etc with status on this and warnings on their use?

Thanks!

John


On Tue, May 31, 2016 at 11:02 AM, Abdel Hakim Deneche <adene...@maprtech.com
> wrote:

> My understanding (which is incomplete) is that both the "new reader" and
> "dictionary encoding" are not stable yet and can cause failures or worse,
> incorrect data. That's why they are disabled by default.
>
> The "Allocation Failure" means that the JVM had to run a Full GC because it
> couldn't allocate more heap for Drill. Looks like Drill is using more that
> 24GB of heap, which is most likely a bug.
>
> What happens if you run the select part of the CTAS, does it also use too
> much heap ?
>
>
> On Tue, May 31, 2016 at 8:54 AM, John Omernik <j...@omernik.com> wrote:
>
> > Oh, the query just stopped showing up in the profiles webui, completely
> > gone like it never happened. Seems to be responding a bit better, the
> > sqlline is still hung though.
> >
> > (Yes this is all related to my CTAS of the parquet data, at this point I
> am
> > just looking for ways to handle the data and not make drill really
> unhappy.
> > )
> >
> > On Tue, May 31, 2016 at 10:51 AM, John Omernik <j...@omernik.com> wrote:
> >
> > > Also: Doing a CTAS using the new reader and dictionary encoding is
> > > producing this, everything is hung at this point. The query in sqlline
> is
> > > not returning, the web UI is running extremely slowly, and when it does
> > > return, shows the running query, however, when I click on it, the
> profile
> > > shows an error saying profile not found.  The Full GCs are happening
> > quite
> > > a bit, and take a long time (>10 seconds) And (this is my tailed
> gcclog,
> > > it's actually writing part of the the "allocation error" message and
> then
> > > waits a before anything else happens. This is "the scary" state my
> > cluster
> > > can get into, and I am trying to avoid this :) Any tips on what may be
> > > happening here would be appreciated.
> > >
> > > (24 GB of Heap, 5 nodes at this point)
> > >
> > >
> > >
> > >
> > >
> > > 912.895: [Full GC (Allocation Failure)  23G->20G(24G), 11.7923015 secs]
> > >
> > > 2924.692: [GC concurrent-mark-abort]
> > >
> > > 2925.099: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0540177
> > > secs]
> > >
> > > 2925.401: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > > 22G->21G(24G), 0.0638409 secs]
> > >
> > > 2925.465: [GC concurrent-root-region-scan-start]
> > >
> > > 2925.475: [GC concurrent-root-region-scan-end, 0.0097528 secs]
> > >
> > > 2925.475: [GC concurrent-mark-start]
> > >
> > > 2925.846: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0454322
> > > secs]
> > >
> > > 2926.252: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0543209
> > > secs]
> > >
> > > 2926.604: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0525408
> > > secs]
> > >
> > > 2926.986: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0534530
> > > secs]
> > >
> > > 2927.389: [GC concurrent-mark-end, 1.9133249 secs]
> > >
> > > 2927.405: [GC remark, 0.0446448 secs]
> > >
> > > 2927.462: [GC cleanup 22G->22G(24G), 0.0290235 secs]
> > >
> > > 2927.494: [GC concurrent-cleanup-start]
> > >
> > > 2927.494: [GC concurrent-cleanup-end, 0.0000190 secs]
> > >
> > > 2927.530: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0500267
> > > secs]
> > >
> > > 2927.828: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0462845
> > > secs]
> > >
> > > 2928.184: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > > 22G->21G(24G), 0.0749704 secs]
> > >
> > > 2928.259: [GC concurrent-root-region-scan-start]
> > >
> > > 2928.268: [GC concurrent-root-region-scan-end, 0.0093531 secs]
> > >
> > > 2928.268: [GC concurrent-mark-start]
> > >
> > > 2928.568: [GC pause (G1 Evacuation Pause) (young) 22G->22G(24G),
> > 0.0555025
> > > secs]
> > >
> > > 2928.952: [GC pause (G1 Evacuation Pause) (young) 23G->22G(24G),
> > 0.0489993
> > > secs]
> > >
> > > 2929.333: [GC pause (G1 Evacuation Pause) (young)-- 23G->22G(24G),
> > > 0.0676159 secs]
> > >
> > > 2929.693: [GC pause (G1 Evacuation Pause) (young)-- 23G->23G(24G),
> > > 0.2088768 secs]
> > >
> > > 2929.914: [Full GC (Allocation Failure)  23G->20G(24G), 11.6264600
> secs]
> > >
> > > 2941.544: [GC concurrent-mark-abort]
> > >
> > > 2941.836: [GC pause (G1 Evacuation Pause) (young) 22G->20G(24G),
> > 0.0416962
> > > secs]
> > >
> > > 2942.127: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > > 22G->21G(24G), 0.0627406 secs]
> > >
> > > 2942.190: [GC concurrent-root-region-scan-start]
> > >
> > > 2942.193: [GC concurrent-root-region-scan-end, 0.0029795 secs]
> > >
> > > 2942.193: [GC concurrent-mark-start]
> > >
> > > 2942.548: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0591030
> > > secs]
> > >
> > > 2942.934: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0589163
> > > secs]
> > >
> > > 2943.304: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0459117
> > > secs]
> > >
> > > 2943.743: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0461640
> > > secs]
> > >
> > > 2943.941: [GC concurrent-mark-end, 1.7476855 secs]
> > >
> > > 2943.953: [GC remark, 0.0356995 secs]
> > >
> > > 2944.000: [GC cleanup 22G->22G(24G), 0.0307393 secs]
> > >
> > > 2944.034: [GC concurrent-cleanup-start]
> > >
> > > 2944.034: [GC concurrent-cleanup-end, 0.0000281 secs]
> > >
> > > 2944.162: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0558067
> > > secs]
> > >
> > > 2944.510: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0497960
> > > secs]
> > >
> > > 2944.837: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > > 22G->21G(24G), 0.0719856 secs]
> > >
> > > 2944.909: [GC concurrent-root-region-scan-start]
> > >
> > > 2944.917: [GC concurrent-root-region-scan-end, 0.0076375 secs]
> > >
> > > 2944.917: [GC concurrent-mark-start]
> > >
> > > 2945.204: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0476954
> > > secs]
> > >
> > > 2945.604: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > 0.0438138
> > > secs]
> > >
> > > 2945.940: [GC pause (G1 Evacuation Pause) (young) 23G->22G(24G),
> > 0.0554554
> > > secs]
> > >
> > > 2946.358: [GC pause (G1 Evacuation Pause) (young) 23G->22G(24G),
> > 0.0502923
> > > secs]
> > >
> > > 2946.708: [GC pause (G1 Evacuation Pause) (young)-- 23G->22G(24G),
> > > 0.0728342 secs]
> > >
> > > 2947.021: [GC pause (G1 Evacuation Pause) (young)-- 23G->23G(24G),
> > > 0.1938188 secs]
> > >
> > > 2947.227: [Full GC (Allocation Failure)
> > >
> > > On Tue, May 31, 2016 at 10:20 AM, John Omernik <j...@omernik.com>
> wrote:
> > >
> > >> I am looking at the GC logs for some big queries (now that I know how
> to
> > >> enabled them, thanks Paul!)  and found the item below, it worries, "it
> > >> says, failure, it too 8 seconds) Should I be worried about that? I
> know
> > my
> > >> HEAP is set fairly high (24GB) how should I interpret this?
> > >>
> > >> John
> > >>
> > >>
> > >> 1924.138: [GC concurrent-root-region-scan-end, 0.0019536 secs]
> > >>
> > >> 1924.138: [GC concurrent-mark-start]
> > >>
> > >> 1924.505: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > >> 0.0314046 secs]
> > >>
> > >> 1924.853: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > >> 0.0466492 secs]
> > >>
> > >> 1925.110: [GC concurrent-mark-end, 0.9724279 secs]
> > >>
> > >> 1925.126: [GC remark, 0.0403063 secs]
> > >>
> > >> 1925.182: [GC cleanup 22G->22G(24G), 0.0117316 secs]
> > >>
> > >> 1925.306: [GC pause (G1 Evacuation Pause) (young) 22G->21G(24G),
> > >> 0.0434484 secs]
> > >>
> > >> 1925.623: [GC pause (G1 Evacuation Pause) (mixed)-- 22G->23G(24G),
> > >> 0.2346237 secs]
> > >>
> > >> 1926.001: [GC pause (G1 Evacuation Pause) (mixed)-- 23G->23G(24G),
> > >> 0.6835194 secs]
> > >>
> > >> 1926.702: [GC pause (G1 Evacuation Pause) (young) 23G->23G(24G),
> > >> 0.0388452 secs]
> > >>
> > >> 1926.756: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > >> 23G->23G(24G), 0.0265543 secs]
> > >>
> > >> 1926.783: [GC concurrent-root-region-scan-start]
> > >>
> > >> 1926.783: [GC concurrent-root-region-scan-end, 0.0000162 secs]
> > >>
> > >> 1926.783: [GC concurrent-mark-start]
> > >>
> > >> 1926.798: [GC pause (G1 Evacuation Pause) (young) 23G->23G(24G),
> > >> 0.0397901 secs]
> > >>
> > >> 1926.850: [GC pause (G1 Evacuation Pause) (young) 23G->23G(24G),
> > >> 0.0370946 secs]
> > >>
> > >> 1926.902: [Full GC (Allocation Failure)  23G->18G(24G), 8.3348025
> secs]
> > >>
> > >> 1935.243: [GC concurrent-mark-abort]
> > >>
> > >> 1935.967: [GC pause (G1 Evacuation Pause) (young) 20G->18G(24G),
> > >> 0.0479378 secs]
> > >>
> > >> 1936.733: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > >> 20G->18G(24G), 0.0556650 secs]
> > >>
> > >> 1936.789: [GC concurrent-root-region-scan-start]
> > >>
> > >> 1936.794: [GC concurrent-root-region-scan-end, 0.0049816 secs]
> > >>
> > >> 1936.794: [GC concurrent-mark-start]
> > >>
> > >> 1937.526: [GC pause (G1 Evacuation Pause) (young) 20G->18G(24G),
> > >> 0.0528627 secs]
> > >>
> > >> 1937.793: [GC concurrent-mark-end, 0.9993197 secs]
> > >>
> > >> 1937.811: [GC remark, 0.0503934 secs]
> > >>
> > >> 1937.878: [GC cleanup 19G->19G(24G), 0.0124116 secs]
> > >>
> > >> 1938.225: [GC pause (G1 Evacuation Pause) (young) 20G->18G(24G),
> > >> 0.0500627 secs]
> > >>
> > >> 1938.827: [GC pause (G1 Evacuation Pause) (young) 20G->18G(24G),
> > >> 0.0434986 secs]
> > >>
> > >> 1939.352: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
> > >> 20G->18G(24G), 0.0595028 secs]
> > >>
> > >> 1939.412: [GC concurrent-root-region-scan-start]
> > >>
> > >> 1939.415: [GC concurrent-root-region-scan-end, 0.0038086 secs]
> > >>
> > >> 1939.415: [GC concurrent-mark-start]
> > >>
> > >> 1939.911: [GC pause (G1 Evacuation Pause) (young) 20G->19G(24G),
> > >> 0.0394493 secs]
> > >>
> > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Reply via email to