Are you observing very high memory usage? The session state should become 
available for cleanup once the connection is closed, so memory should really 
only scale with the number of active connections. At this point that would be 
a _great_ problem to have :) (since no one has really used the Thrift server 
for real yet!)

________________________________
From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 11:39 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

Oops, my fault. It works now.
However, the overhead of initializing a new session state is too high. I guess 
it will cause a StackOverflowError once the number of connections reaches a 
certain level.

On Tue, Mar 10, 2009 at 2:16 PM, Min Zhou <coderp...@gmail.com> wrote:
No connections right now; the server cannot start properly.

On Tue, Mar 10, 2009 at 2:07 PM, Joydeep Sen Sarma 
<jssa...@facebook.com> wrote:

Hey - I'm not able to understand - does this mean it didn't work? Can you 
explain in more detail what you did (how many connections/requests etc.)?



________________________________

From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 11:04 PM

To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state



The server stayed stuck at startup.

On Tue, Mar 10, 2009 at 1:36 PM, Joydeep Sen Sarma 
<jssa...@facebook.com> wrote:

Attaching a small patch. Can you try it and see if it works? (It compiles and 
passes the hiveserver test.)



It does seem that the getProcessor() call is made every time a new connection 
starts to get serviced (so, yes, after the accept call).



________________________________

From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 10:22 PM

To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state



Hi,

On Tue, Mar 10, 2009 at 12:07 PM, Prasad Chakka 
<pra...@facebook.com> wrote:

I use thread-specific storage to do something similar, i.e. I keep the 
initialized session in the TSS, so new threads will not have it initialized. 
Would that work here?

I am sorry, you misunderstood our meaning. We are saying that new threads 
should be initialized with a new session state.



The Thrift Interface handler is constructed just once for the lifetime of the 
HiveServer. The session object is initialized inside this constructor. The 
SessionState.start() is called only once (in the constructor)

All connections/requests then go through the same handler object - but when 
they run in worker threads - they don't have a current session object for that 
thread.

This is exactly what I wanted to explain.
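
To make the problem described above concrete, here is a minimal sketch of it. It assumes the session is held in thread-local storage, and it does not use Hive or Thrift at all: `Session` and `Handler` are hypothetical stand-ins for SessionState and the Thrift interface handler. The handler is constructed once on the starting thread, and a worker thread then finds no current session.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalSessionDemo {
    // Hypothetical stand-in for SessionState: one "current" session per thread.
    static final class Session {
        private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();
        static void start(Session s) { CURRENT.set(s); }
        static Session get() { return CURRENT.get(); }
    }

    // Hypothetical stand-in for the handler: the session is started only in
    // the constructor, i.e. only on the thread that constructs the handler.
    static final class Handler {
        Handler() { Session.start(new Session()); }
        boolean execute() { return Session.get() != null; }
    }

    // Returns whether a worker thread sees a current session.
    static boolean sessionVisibleInWorker() throws Exception {
        Handler handler = new Handler();           // constructed on the calling thread
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> task = handler::execute;
            return pool.submit(task).get();        // runs on the worker thread
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean inWorker = sessionVisibleInWorker();
        System.out.println("session visible on constructing thread: " + (Session.get() != null));
        System.out.println("session visible on worker thread: " + inWorker);
    }
}
```

In this sketch the constructing thread has a session and the worker thread does not, which matches the null returned by SessionState.get() in the original report.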



I think the solution here is to initialize TThreadPoolServer using a custom 
implementation of TProcessorFactory. The getProcessor() call can return a 
freshly constructed handler (with new SessionState etc.). This will work in all 
scenarios I think. (btw - the same thread can serve different connections - but 
from reading the code - a getProcessor() call will be made for every connection)

(see TThreadPoolServer.WorkerProcess and TThreadPoolServer.serve())

This idea will work if the TProcessorFactory can catch the event that the 
server has already accepted a client, I think.
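
The factory idea discussed above can be sketched roughly as follows. This is not the attached patch and it does not use the Thrift classes: `Processor` and `ProcessorFactory` are hypothetical stand-ins for TProcessor and TProcessorFactory, `Session` stands in for SessionState, and the thread pool only imitates worker threads that call getProcessor() once per accepted connection.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerConnectionSessionDemo {
    // Hypothetical stand-in for SessionState, held per thread.
    static final class Session {
        private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
        static Session start() { Session s = new Session(); CURRENT.set(s); return s; }
        static Session get() { return CURRENT.get(); }
    }

    // Hypothetical stand-ins for the Thrift processor and processor factory.
    interface Processor { int process(); }
    interface ProcessorFactory { Processor getProcessor(); }

    // Starts a fresh session on the calling (worker) thread for every connection.
    static final class FreshHandlerFactory implements ProcessorFactory {
        @Override public Processor getProcessor() {
            Session.start();                       // replaces any session left by a previous connection
            return () -> Session.get().id;         // the "request" just reports its session id
        }
    }

    // Simulates `connections` clients served by `workers` pooled threads and
    // returns the set of session ids the requests observed.
    static Set<Integer> serve(int connections, int workers) throws InterruptedException {
        ProcessorFactory factory = new FreshHandlerFactory();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < connections; i++) {
            pool.execute(() -> {
                Processor p = factory.getProcessor();  // once per connection, on the worker thread
                seen.add(p.process());
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("distinct sessions: " + serve(5, 2).size());
    }
}
```

Because the session is started inside getProcessor(), a worker thread that is reused for a later connection gets a new session rather than inheriting the previous one. Whether the real server calls getProcessor() on the worker thread for each connection should be checked against the Thrift source mentioned above.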



________________________________

From: Prasad Chakka [mailto:pra...@facebook.com]
Sent: Monday, March 09, 2009 8:23 PM
To: hive-user@hadoop.apache.org
Subject: Re: thread cofinement session state

I am assuming he is using the same code as MetaStore server. AFAIK, 
TThreadPoolServer is supposed to use a new thread for each connection.

________________________________

From: Joydeep Sen Sarma <jssa...@facebook.com>
Reply-To: <hive-user@hadoop.apache.org>
Date: Mon, 9 Mar 2009 20:16:22 -0700
To: <hive-user@hadoop.apache.org>
Subject: RE: thread cofinement session state

(also been reading up on this code a bit just now)

That's weird. It seems to be using TThreadPoolServer and that seems to just 
service all requests from a single connection in one thread. (and uses the same 
processor I assume that seems to initialize the session state in the interface 
constructor)

Are your execute calls happening on the same connection?

________________________________

From: Min Zhou [mailto:coderp...@gmail.com]
Sent: Monday, March 09, 2009 8:03 PM
To: hive-user@hadoop.apache.org
Subject: thread cofinement session state

Hi list,
   I found that each invocation of HiveServer's execute method runs in a 
different thread. Those threads, each of which executes a Hive QL query (note 
that a single client connection may execute several queries), do not have 
their own session state. When I call SessionState.get(), it returns null 
because the session state has never been constructed on that thread. See also 
this fragment of ExecDriver.java:

  public static String getRealFiles(Configuration conf) {
    // fill in local files to be added to the task environment
    SessionState ss = SessionState.get();  // ss is null here: no session on this thread
    ...
  }

Is it a bug?

Thanks,
Min
--
My research interests are distributed systems, parallel computing and bytecode 
based virtual machine.

http://coderplay.javaeye.com


