ok. thanks. so given everything we know the choices i see are:
1. increase your heapsize some more. (And of course confirm that the process you reported with -Xmx8192M is the HiveServer2 process.)

2. modify your query so that it doesn't use "select *".

3. modify your query so that it does its own buffering. maybe stream it? (see the sketch after this list.)

4. create a jira ticket and request that the internal buffer size hiveserver2 uses for staging results be made configurable.
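For #3, a rough sketch of doing the buffering yourself at the query level - assuming the table has a numeric key column to slice on (i'm calling it row_id here, which is made up; substitute whatever key you actually have) - is to pull the rows down in bounded windows instead of one giant result set:

    -- fetch bounded slices so hiveserver2 never has to stage the whole
    -- result set in its heap at once (row_id is a hypothetical key column)
    select * from bigtable where row_id >= 0     and row_id < 10000;
    select * from bigtable where row_id >= 10000 and row_id < 20000;
    -- ... keep advancing the window until the table is exhausted

Each slice is its own FetchResults cycle, so the server-side ArrayList only ever holds one window's worth of rows. (Each slice does rescan the table, so this trades heap for runtime.)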
That's all _i_ got left in the tank for this issue. I think we need an SME who is familiar with the code now.

Regards,
Stephen.

On Tue, Feb 18, 2014 at 10:57 AM, David Gayou <david.ga...@kxen.com> wrote:

> Sorry, i badly reported it. It's 8192M.
>
> Thanks,
>
> David.
>
> On 18 Feb 2014 18:37, "Stephen Sprague" <sprag...@gmail.com> wrote:
>
>> oh. i just noticed the -Xmx value you reported.
>>
>> there's no M or G after that number?? I'd like to see -Xmx8192M or
>> -Xmx8G. That *is* very important.
>>
>> thanks,
>> Stephen.
>>
>> On Tue, Feb 18, 2014 at 9:22 AM, Stephen Sprague <sprag...@gmail.com> wrote:
>>
>>> thanks.
>>>
>>> re #1. we need to find that HiveServer2 process. For all i know, the one
>>> you reported is hiveserver1 (which works). chances are they use the same
>>> -Xmx value, but we really shouldn't make any assumptions.
>>>
>>> try wide format on the ps command (eg. ps -efw | grep -i hiveserver2)
>>>
>>> re #2. okay. so that tells us it's not the number of columns blowing the
>>> heap but rather the combination of rows + columns. There's no way it
>>> stores the full result set on the heap even under normal circumstances, so
>>> my guess is there's an internal number of rows it buffers. sorta like how
>>> unix buffers stdout. How and where that's set is out of my league.
>>> However, maybe you can get around it by upping your heapsize again, if you
>>> have the available memory of course.
>>>
>>> On Tue, Feb 18, 2014 at 8:39 AM, David Gayou <david.ga...@kxen.com> wrote:
>>>
>>>> 1. I have no process with hiveserver2 ...
>>>>
>>>> "ps -ef | grep -i hive" returns some pretty long command with a
>>>> -Xmx8192, and that's the value set in hive-env.sh
>>>>
>>>> 2. The "select * from table limit 1", or even limit 100, works correctly.
>>>>
>>>> David.
>>>>
>>>> On Tue, Feb 18, 2014 at 4:16 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>
>>>>> He lives on after all! and thanks for the continued feedback.
>>>>>
>>>>> We need the answers to these questions using HS2:
>>>>>
>>>>> 1. what is the output of "ps -ef | grep -i hiveserver2" on your
>>>>> system? in particular, what is the value of -Xmx?
>>>>>
>>>>> 2. does "select * from table limit 1" work?
>>>>>
>>>>> Thanks,
>>>>> Stephen.
>>>>>
>>>>> On Tue, Feb 18, 2014 at 6:32 AM, David Gayou <david.ga...@kxen.com> wrote:
>>>>>
>>>>>> I'm so sorry, i wrote an answer and forgot to send it....
>>>>>> And i haven't been able to work on this for a few days.
>>>>>>
>>>>>> So far:
>>>>>>
>>>>>> I have a table with 15k columns and 50k rows.
>>>>>>
>>>>>> I do not see any change if i change the storage format.
>>>>>>
>>>>>> *Hive 0.12.0*
>>>>>>
>>>>>> My test query is "select * from bigtable"
>>>>>>
>>>>>> If i use the hive cli, it works fine.
>>>>>>
>>>>>> If i use hiveserver1 + ODBC: it works fine.
>>>>>>
>>>>>> If i use hiveserver2 + ODBC, or hiveserver2 + beeline, i have this
>>>>>> java exception:
>>>>>>
>>>>>> 2014-02-18 13:22:22,571 ERROR thrift.ProcessFunction
>>>>>> (ProcessFunction.java:process(41)) - Internal error processing FetchResults
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>>     at java.util.Arrays.copyOf(Arrays.java:2734)
>>>>>>     at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
>>>>>>     at java.util.ArrayList.add(ArrayList.java:351)
>>>>>>     at org.apache.hive.service.cli.thrift.TRow.addToColVals(TRow.java:160)
>>>>>>     at org.apache.hive.service.cli.RowBasedSet.addRow(RowBasedSet.java:60)
>>>>>>     at org.apache.hive.service.cli.RowBasedSet.addRow(RowBasedSet.java:32)
>>>>>>     at org.apache.hive.service.cli.operation.SQLOperation.prepareFromRow(SQLOperation.java:270)
>>>>>>     at org.apache.hive.service.cli.operation.SQLOperation.decode(SQLOperation.java:262)
>>>>>>     at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:246)
>>>>>>     at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:171)
>>>>>>     at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:438)
>>>>>>     at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:346)
>>>>>>     at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:407)
>>>>>>     at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1373)
>>>>>>
>>>>>> *From the SVN trunk* (for HIVE-3746):
>>>>>>
>>>>>> With the maven change, most of the documentation and wiki are out of
>>>>>> date. Compiling from trunk was not that easy, and i may have failed some
>>>>>> steps, but:
>>>>>>
>>>>>> It has the same behavior. It works in the CLI and hiveserver1.
>>>>>> It fails with hiveserver2.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> David Gayou
>>>>>>
>>>>>> On Thu, Feb 13, 2014 at 3:11 AM, Navis류승우 <navis....@nexr.com> wrote:
>>>>>>
>>>>>>> With HIVE-3746, which will be included in hive-0.13, HiveServer2
>>>>>>> takes less memory than before.
>>>>>>>
>>>>>>> Could you try it with the version in trunk?
>>>>>>>
>>>>>>> 2014-02-13 10:49 GMT+09:00 Stephen Sprague <sprag...@gmail.com>:
>>>>>>>
>>>>>>>> question to the original poster. closure appreciated!
>>>>>>>>
>>>>>>>> On Fri, Jan 31, 2014 at 12:22 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> thanks Ed. And on a separate tack, lets look at HiveServer2.
>>>>>>>>>
>>>>>>>>> @OP>
>>>>>>>>>
>>>>>>>>> *I've tried to look around on how i can change the thrift heap
>>>>>>>>> size but haven't found anything.*
>>>>>>>>>
>>>>>>>>> looking at my hiveserver2 i find this:
>>>>>>>>>
>>>>>>>>> $ ps -ef | grep -i hiveserver2
>>>>>>>>> dwr 9824 20479 0 12:11 pts/1 00:00:00 grep -i hiveserver2
>>>>>>>>> dwr 28410 1 0 00:05 ? 00:01:04 /usr/lib/jvm/java-6-sun/jre/bin/java
>>>>>>>>> *-Xmx256m* -Dhadoop.log.dir=/usr/lib/hadoop/logs
>>>>>>>>> -Dhadoop.log.file=hadoop.log
>>>>>>>>> -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=
>>>>>>>>> -Dhadoop.root.logger=INFO,console
>>>>>>>>> -Djava.library.path=/usr/lib/hadoop/lib/native
>>>>>>>>> -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
>>>>>>>>> -Dhadoop.security.logger=INFO,NullAppender
>>>>>>>>> org.apache.hadoop.util.RunJar
>>>>>>>>> /usr/lib/hive/lib/hive-service-0.12.0.jar
>>>>>>>>> org.apache.hive.service.server.HiveServer2
>>>>>>>>>
>>>>>>>>> questions:
>>>>>>>>>
>>>>>>>>> 1. what is the output of "ps -ef | grep -i hiveserver2" on your
>>>>>>>>> system? in particular, what is the value of -Xmx?
>>>>>>>>>
>>>>>>>>> 2. can you restart your hiveserver with -Xmx1g? or some value
>>>>>>>>> that makes sense for your system?
>>>>>>>>>
>>>>>>>>> Lots of questions now. we await your answers! :)
>>>>>>>>>
>>>>>>>>> On Fri, Jan 31, 2014 at 11:51 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Final table compression should not affect the deserialized size
>>>>>>>>>> of the data over the wire.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 31, 2014 at 2:49 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Excellent progress, David. So the most important thing we
>>>>>>>>>>> learned here is that it works (!) by running hive in local mode, and
>>>>>>>>>>> that this error is a limitation in HiveServer2. That's important.
>>>>>>>>>>>
>>>>>>>>>>> so, textfile storage handler, and having issues converting it to
>>>>>>>>>>> ORC. hmmm.
>>>>>>>>>>>
>>>>>>>>>>> follow-ups.
>>>>>>>>>>>
>>>>>>>>>>> 1. what is your query that fails?
>>>>>>>>>>>
>>>>>>>>>>> 2. can you add a "limit 1" to the end of your query and tell us
>>>>>>>>>>> if that works? this'll tell us if it's column or row bound.
>>>>>>>>>>>
>>>>>>>>>>> 3. bonus points. run these in local mode:
>>>>>>>>>>> > set hive.exec.compress.output=true;
>>>>>>>>>>> > set mapred.output.compression.type=BLOCK;
>>>>>>>>>>> > set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>>>>>>>>>>> > create table blah stored as ORC as select * from <your table>;   #i'm curious if this'll work.
>>>>>>>>>>> > show create table blah;   #send output back if the previous step worked.
>>>>>>>>>>>
>>>>>>>>>>> 4. extra bonus. change ORC to SEQUENCEFILE in #3 and see if that
>>>>>>>>>>> works any differently.
>>>>>>>>>>>
>>>>>>>>>>> I'm wondering if compression would have any effect on the size
>>>>>>>>>>> of the internal ArrayList the thrift server uses.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 31, 2014 at 9:21 AM, David Gayou <david.ga...@kxen.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok, so here are some news:
>>>>>>>>>>>>
>>>>>>>>>>>> I tried to boost HADOOP_HEAPSIZE to 8192,
>>>>>>>>>>>> and I also set mapred.child.java.opts to 512M.
>>>>>>>>>>>>
>>>>>>>>>>>> It doesn't seem to have any effect.
>>>>>>>>>>>> ------
>>>>>>>>>>>>
>>>>>>>>>>>> I tried it using an ODBC driver => fails after a few minutes.
>>>>>>>>>>>> Using local JDBC (beeline) => runs forever without any error.
>>>>>>>>>>>>
>>>>>>>>>>>> Both through hiveserver2.
>>>>>>>>>>>>
>>>>>>>>>>>> If i use local mode: it works! (but that's not really what
>>>>>>>>>>>> i need, as i don't really know how to access it from my software)
>>>>>>>>>>>>
>>>>>>>>>>>> ------
>>>>>>>>>>>> I use a text file as storage.
>>>>>>>>>>>> I tried to use ORC, but i can't populate it with a load data
>>>>>>>>>>>> (it returns a file format error).
>>>>>>>>>>>>
>>>>>>>>>>>> Using an "ALTER TABLE orange_large_train_3 SET FILEFORMAT ORC"
>>>>>>>>>>>> after populating the table, i get a file format error on select.
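For what it's worth: "load data" only moves files into the table's directory and does no format conversion, so a text file loaded into an ORC table - or a text-loaded table flipped with ALTER TABLE ... SET FILEFORMAT ORC - leaves text bytes behind an ORC reader, hence the file format errors on select. The usual route is to land the data in a text-format staging table and rewrite it through a query, same idea as the "create table ... stored as ORC as select ..." experiment above. A minimal sketch, with illustrative table/column names standing in for the real 15k-column schema:

    -- staging table matching the raw text file layout (names are made up)
    create table bigtable_text (id string, var1 string, var2 string)
    row format delimited fields terminated by '\t';
    load data inpath '/tmp/bigtable.tsv' into table bigtable_text;

    -- the CTAS rewrites every row into ORC format as it inserts
    create table bigtable_orc stored as ORC as
    select * from bigtable_text;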
>>>>>>>>>>>> ------
>>>>>>>>>>>>
>>>>>>>>>>>> @Edward:
>>>>>>>>>>>>
>>>>>>>>>>>> I've tried to look around on how i can change the thrift heap
>>>>>>>>>>>> size but haven't found anything.
>>>>>>>>>>>> Same thing for my client (haven't found how to change its heap
>>>>>>>>>>>> size).
>>>>>>>>>>>>
>>>>>>>>>>>> My use case is really to have as many columns as possible.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot for your help
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 31, 2014 at 1:12 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Ok, here are the problem(s). Thrift has frame size limits, and
>>>>>>>>>>>>> thrift has to buffer rows into memory.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hive thrift has a heap size; it needs to be big in this case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Your client needs a big heap size as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The way to do this query, if it is possible, may be turning the
>>>>>>>>>>>>> rows lateral, potentially by treating each one as a list. it will
>>>>>>>>>>>>> make queries on it awkward.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good luck
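If Edward's "turning the rows lateral" idea means what i think it does - collapsing the 15k scalar columns into a single collection-typed column so each row stays narrow on the wire - a minimal sketch, assuming string values and made-up names, would be:

    -- one map column instead of 15k scalar columns
    create table bigtable_lateral (
      id       string,
      features map<string,string>
    );

    -- values come back out by key, which is the awkward part he means:
    -- every former column reference becomes a map lookup
    select id, features['var_00042'] from bigtable_lateral;

The rows going through thrift then carry two values each instead of 15k, at the cost of rewriting every query.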
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thursday, January 30, 2014, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>>>>>>>>>> > oh. thinking some more about this, i forgot to ask some other
>>>>>>>>>>>>> basic questions.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > a) what storage format are you using for the table (text,
>>>>>>>>>>>>> sequence, rcfile, orc or custom)? "show create table <table>" would yield
>>>>>>>>>>>>> that.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > b) what command is causing the stack trace?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > my thinking here is rcfile and orc are column based (i
>>>>>>>>>>>>> think), and if you don't select all the columns, that could very well limit
>>>>>>>>>>>>> the size of the "row" being returned and hence the size of the internal
>>>>>>>>>>>>> ArrayList. OTOH, if you're using "select *", um, you have my sympathies. :)
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Thu, Jan 30, 2014 at 11:33 AM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > thanks for the information. Up-to-date hive. Cluster on the
>>>>>>>>>>>>> smallish side. And, well, it sure looks like a memory issue :) rather than
>>>>>>>>>>>>> an inherent hive limitation, that is.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > So. I can only speak as a user (ie. not a hive developer), but
>>>>>>>>>>>>> what i'd be interested in knowing next is: this is via running hive in
>>>>>>>>>>>>> local mode, correct? (eg. not through hiveserver1/2). And it looks like it
>>>>>>>>>>>>> boinks on array processing, which i assume to be internal code arrays and
>>>>>>>>>>>>> not hive data arrays - your 15K columns are all scalar/simple types,
>>>>>>>>>>>>> correct? It's clearly fetching results and looks to be trying to store them
>>>>>>>>>>>>> in a java array - and not just one row but a *set* of rows (ArrayList).
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > a few things to try.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 1. boost the heap-size. try 8192. And I don't know if
>>>>>>>>>>>>> HADOOP_HEAPSIZE is the controller of that. I woulda hoped it was called
>>>>>>>>>>>>> something like "HIVE_HEAPSIZE". :) Anyway, can't hurt to try.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 2. trim down the number of columns and see where the
>>>>>>>>>>>>> breaking point is. is it 10K? is it 5K? The idea is to confirm it's _the
>>>>>>>>>>>>> number of columns_ that is causing the memory to blow and not some other
>>>>>>>>>>>>> artifact unbeknownst to us.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 3. Google around the Hive namespace for something that might
>>>>>>>>>>>>> limit or otherwise control the number of rows stored at once in Hive's
>>>>>>>>>>>>> internal buffer. I'll snoop around too.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > That's all i got for now, and maybe we'll get lucky and
>>>>>>>>>>>>> someone on this list will know something or another about this. :)
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > cheers,
>>>>>>>>>>>>> > Stephen.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Thu, Jan 30, 2014 at 2:32 AM, David Gayou <david.ga...@kxen.com> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > We are using Hive 0.12.0, but it doesn't work any better on
>>>>>>>>>>>>> hive 0.11.0 or hive 0.10.0.
>>>>>>>>>>>>> > Our hadoop version is 1.1.2.
>>>>>>>>>>>>> > Our cluster is 1 master + 4 slaves, each with 1 dual-core xeon CPU
>>>>>>>>>>>>> (with hyperthreading, so 4 cores per machine) + 16GB RAM.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > The error message i get is:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 2014-01-29 12:41:09,086 ERROR thrift.ProcessFunction
>>>>>>>>>>>>> (ProcessFunction.java:process(41)) - Internal error processing FetchResults
>>>>>>>>>>>>> > java.lang.OutOfMemoryError: Java heap space
>>>>>>>>>>>>> >     at java.util.Arrays.copyOf(Arrays.java:2734)
>>>>>>>>>>>>> >     at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
>>>>>>>>>>>>> >     at java.util.ArrayList.add(ArrayList.java:351)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.Row.<init>(Row.java:47)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.RowSet.addRow(RowSet.java:61)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:235)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:417)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:306)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:386)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1373)
>>>>>>>>>>>>> >     at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1358)
>>>>>>>>>>>>> >     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>>>>>>>>>>>>> >     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>>>>>>>>>>>>> >     at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:58)
>>>>>>>>>>>>> >     at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:55)
>>>>>>>>>>>>> >     at java.security.AccessCont
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Sorry this was sent from mobile. Will do less grammar and
>>>>>>>>>>>>> spell check than usual.