[
https://issues.apache.org/jira/browse/PHOENIX-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900643#comment-13900643
]
Maryann Xue commented on PHOENIX-34:
------------------------------------
Tested with RHS tables of 250K, 500K, 1M, and 1.5M rows. The logged size
estimates were 52M, 106M, 212M, and 320M respectively, which is very close to
my previous estimate based on the formula.
All of the above RHS tables worked fine on 1 RS with 1GiB total heap (default
50% for Phoenix), without getting InsufficientMemoryException. But when the
RHS reached 2M rows (which should be roughly 400M in size), the Region Server
simply crashed.
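
As a rough cross-check of the numbers above, the sketch below derives the
implied bytes per row from the four logged estimates, extrapolates to 2M rows,
and compares the result with the Phoenix share of a 1GiB heap. It is only
back-of-the-envelope arithmetic, not the estimation formula Phoenix uses. The
class and method names are hypothetical, and it assumes "K" and "M" are
decimal multipliers.

```java
public class HashCacheEstimate {
    // Row counts and logged size estimates taken from the comment above.
    // Assumption: K = 10^3 rows and M = 10^6 bytes.
    static final long[] ROWS = {250_000L, 500_000L, 1_000_000L, 1_500_000L};
    static final long[] ESTIMATED_BYTES =
        {52_000_000L, 106_000_000L, 212_000_000L, 320_000_000L};

    // Implied size of a single RHS row for one measurement.
    static double bytesPerRow(long rows, long bytes) {
        return (double) bytes / rows;
    }

    // Mean of the per-row sizes across all four measurements.
    static double averageBytesPerRow() {
        double sum = 0;
        for (int i = 0; i < ROWS.length; i++) {
            sum += bytesPerRow(ROWS[i], ESTIMATED_BYTES[i]);
        }
        return sum / ROWS.length;
    }

    // Linear extrapolation of the estimate to a different row count.
    static long extrapolate(long rows) {
        return Math.round(averageBytesPerRow() * rows);
    }

    // Share of the region server heap that Phoenix may use.
    static long phoenixBudgetBytes(long heapBytes, int percent) {
        return heapBytes * percent / 100;
    }

    public static void main(String[] args) {
        for (int i = 0; i < ROWS.length; i++) {
            System.out.println(ROWS[i] + " rows: "
                + String.format("%.1f", bytesPerRow(ROWS[i], ESTIMATED_BYTES[i]))
                + " bytes/row");
        }
        long budget = phoenixBudgetBytes(1L << 30, 50); // 1GiB heap, 50%
        System.out.println("2M rows, extrapolated: " + extrapolate(2_000_000L));
        System.out.println("Phoenix budget: " + budget);
    }
}
```

The per-row figures all land a little above 200 bytes, so the 2M-row estimate
comes out somewhat over 400M, in line with the "roughly 400M" above, while the
50% budget of a 1GiB heap is 512MiB.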
These behaviors look as expected. I followed the information in that link and
created the LHS table with 5M rows and 16 salt buckets, and the RHS table with
4 salt buckets.
I suspect that Mujtaba got different results because of the test driver he
used. I found a problem in PhoenixRuntime: it did not seem to call
"ResultSet.close()" and hence did not release the memory assigned to the hash
cache. So I guess that's why you got InsufficientMemoryException even though
the memory required for each table was quite far from the upper limit.
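
To illustrate the point about "ResultSet.close()", here is a minimal sketch of
how a test driver could make sure every result set is closed, using plain JDBC
try-with-resources. It is not the actual PhoenixRuntime code or its fix. The
class name JoinQueryRunner and the method countRows are hypothetical, and the
caller is assumed to supply an open JDBC connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JoinQueryRunner {
    // Runs a query and counts its rows. try-with-resources closes the
    // ResultSet first and then the Statement, even if iteration throws,
    // so resources tied to the result set are not left open between
    // queries.
    static long countRows(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            long count = 0;
            while (rs.next()) {
                count++;
            }
            return count;
        }
    }
}
```

A driver that runs many join queries in a row would call countRows once per
query, so each result set is closed before the next join starts.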
> Insufficient memory exception on join when RHS rows count > 250K
> -----------------------------------------------------------------
>
> Key: PHOENIX-34
> URL: https://issues.apache.org/jira/browse/PHOENIX-34
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 3.0.0
> Environment: HBase 0.94.14, r1543222, Hadoop 1.0.4, r1393290, 2 RS +
> 1 Master, Heap 4GB per RS
> Reporter: Mujtaba Chohan
> Fix For: 3.0.0
>
>
> Join fails when the row count of the RHS table is >250K. Details on the
> table schema and performance numbers with different LHS/RHS row counts are at
> http://phoenix-bin.github.io/client/performance/phoenix-20140210023154.htm.
> James' comment:
> So that's with a 4GB heap allowing Phoenix to use 50% of it. With a pretty
> narrow table: 3 KV columns of 30bytes. Topping out at 250K is a bit low. I
> wonder if our memory estimation matches reality.
> What do you think Maryann?
> How about filing a JIRA, Mujtaba. This is a good conversation to have on the
> dev list. Can we move it there, please?
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)