Are you observing very high memory usage? The session state should be available for cleanup once the connection is closed, so memory should really only scale with the number of active connections. At this point that would be a _great_ problem to have :) (since no one has really used the Thrift server for real!)
________________________________
From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 11:39 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

Oops, my fault. It works now. However, the overhead of initializing a new session state is too high. I guess it will cause a StackOverflowError once the number of connections reaches a certain level.

On Tue, Mar 10, 2009 at 2:16 PM, Min Zhou <coderp...@gmail.com> wrote:

There are no connections right now; the server cannot start properly.

On Tue, Mar 10, 2009 at 2:07 PM, Joydeep Sen Sarma <jssa...@facebook.com> wrote:

Hey - I'm not able to understand - does this mean it didn't work? Can you explain in more detail what you did (how many connections/requests etc.)?

________________________________
From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 11:04 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

The server stays stuck at the starting point.

On Tue, Mar 10, 2009 at 1:36 PM, Joydeep Sen Sarma <jssa...@facebook.com> wrote:

Attaching a small patch. Can you try it and see if this works? (It compiles and passes the hiveserver test.) It does seem that the getProcessor() call is made every time a new connection starts to get serviced (so, yes, after the accept call).

________________________________
From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 10:22 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

Hi,

On Tue, Mar 10, 2009 at 12:07 PM, Prasad Chakka <pra...@facebook.com> wrote:

I use thread-specific storage to do something similar, i.e. I keep the initialized session in the TSS so new threads will not have it initialized. Would that work here?
I am sorry, you misunderstood our meaning; we are saying that new threads should be initialized with a new session state.

The Thrift Interface handler is constructed just once for the lifetime of the HiveServer. The session object is initialized inside this constructor. SessionState.start() is called only once (in the constructor). All connections/requests then go through the same handler object - but when they run in worker threads - they don't have a current session object for that thread.

This is exactly what I wanted to explain.

I think the solution here is to initialize TThreadPoolServer using a custom implementation of TProcessorFactory. The getProcessor() call can return a freshly constructed handler (with a new SessionState etc.). This will work in all scenarios, I think. (By the way, the same thread can serve different connections, but from reading the code, a getProcessor() call will be made for every connection; see TThreadPoolServer.WorkerProcess and TThreadPoolServer.serve().)

This idea will work if the TProcessorFactory can catch the event that the server has already accepted a client, I think.

________________________________
From: Prasad Chakka [mailto:pra...@facebook.com]
Sent: Monday, March 09, 2009 8:23 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

I am assuming he is using the same code as the MetaStore server. AFAIK, TThreadPoolServer is supposed to use a new thread for each connection.

________________________________
From: Joydeep Sen Sarma <jssa...@facebook.com>
Reply-To: <hive-user@hadoop.apache.org>
Date: Mon, 9 Mar 2009 20:16:22 -0700
To: <hive-user@hadoop.apache.org>
Subject: RE: thread cofinement session state

(also been reading up on this code a bit just now) That's weird.
It seems to be using TThreadPoolServer, and that seems to just service all requests from a single connection in one thread (and uses the same processor, I assume, which seems to initialize the session state in the interface constructor). Are your execute calls happening on the same connection?

________________________________
From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 8:03 PM
To: hive-user@hadoop.apache.org
Subject: thread cofinement session state

Hi list,

I found that each invocation of HiveServer's execute method runs in a different thread. The threads that execute an HSQL query (a query, not a client connection; a connection may execute several queries) do not have their own session state. When I call SessionState.get(), it returns null because the session state on this thread has not been constructed before. See also this fragment of ExecDriver.java:

    public static String getRealFiles(Configuration conf) {
      // fill in local files to be added to the task environment
      SessionState ss = SessionState.get(); // ss is null here !!!
      ...
    }

Is it a bug?

Thanks,
Min

--
My research interests are distributed systems, parallel computing and bytecode based virtual machine.
http://coderplay.javaeye.com
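
For readers following the thread, the behaviour discussed above can be illustrated with a small Thrift-free Java sketch. Everything in it is a hypothetical stand-in rather than real Hive or Thrift code: `Session` mimics a thread-local session state, `Handler` mimics the Thrift interface handler that starts a session in its constructor, and `HandlerFactory` plays the role that a custom TProcessorFactory would play. The sketch assumes the factory is invoked on the worker thread that services the connection, which is what the thread suggests about getProcessor() but is not verified here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins only; none of these are the real Hive or Thrift classes.
class SessionPerConnectionSketch {

    /** Mimics a session state that is stored per thread in a ThreadLocal. */
    static final class Session {
        private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

        static Session start() {
            Session s = new Session();
            CURRENT.set(s); // bound to the calling thread only
            return s;
        }

        static Session get() {
            return CURRENT.get(); // null if this thread never called start()
        }

        static void clear() {
            CURRENT.remove(); // make the session eligible for cleanup
        }
    }

    /** Mimics the interface handler: the session is started in the constructor. */
    static final class Handler {
        Handler() {
            Session.start();
        }

        /** Returns true if the executing thread has a current session. */
        boolean execute() {
            return Session.get() != null;
        }
    }

    /** Plays the role of a processor factory that is asked for a handler per connection. */
    interface HandlerFactory {
        Handler forConnection();
    }

    /** Services one simulated connection on a worker thread and reports what execute() saw. */
    static boolean serveOneConnection(HandlerFactory factory) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            return pool.submit(() -> {
                try {
                    // Assumption: the factory runs on the worker thread itself.
                    return factory.forConnection().execute();
                } finally {
                    Session.clear(); // drop the session when the connection ends
                }
            }).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One handler built up front: its session belongs to the main thread.
        Handler shared = new Handler();
        System.out.println("shared handler sees session: "
                + serveOneConnection(() -> shared));        // false
        // A fresh handler per connection: the session is started on the worker thread.
        System.out.println("per-connection handler sees session: "
                + serveOneConnection(Handler::new));        // true
    }
}
```

The shared handler reproduces the null session from the original report, because its session is bound to the thread that constructed it. Building the handler inside the factory call binds the session to the worker thread instead, and clearing it when the connection finishes keeps the number of live sessions proportional to the number of active connections.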