[
https://issues.apache.org/jira/browse/TAJO-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894821#comment-13894821
]
Hyunsik Choi commented on TAJO-548:
-----------------------------------
I'm very sorry for the late review. I uploaded the rebased patch.
I checked the source code. As Min mentioned, that source code seems to be
derived from Bytes.parseInt used in Hive. So, I believe that it does not
license problem.
The solution definitely reduces the creation of many objects. We have wanted
this approach. I throw +1 for this patch.
Aside from this patch, if some source code is derived from JDK (GPL v2 with
classpath exception), the source cannot be included in Apache License Software
as far as I know. See the this page
(http://www.apache.org/legal/resolved.html#category-x).
Thank you for your contribution!
> Investigate frequent young gc
> ------------------------------
>
> Key: TAJO-548
> URL: https://issues.apache.org/jira/browse/TAJO-548
> Project: Tajo
> Issue Type: Improvement
> Affects Versions: 0.8-incubating
> Reporter: Min Zhou
> Assignee: Min Zhou
> Fix For: 0.8-incubating
>
> Attachments: TAJO-548-v2.patch, TAJO-548.patch
>
>
> Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
> Here is my configuration for each worker jvm
> {noformat}
> export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc
> -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
> {noformat}
> I observed very frequent young gc
> {noformat}
> $ jstat -gcutil `pgrep -f TajoWorker` 1000
> S0 S1 E O P YGC YGCT FGC FGCT GCT
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.41 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.44 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.53 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.57 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 66.73 3.79 99.66 5185 75.042 0 0.000 75.042
> 0.00 66.46 17.89 3.79 99.66 5190 75.082 0 0.000 75.082
> 8.79 0.00 37.36 3.88 99.66 5202 75.533 0 0.000 75.533
> 0.00 9.10 21.61 3.88 99.66 5216 75.926 0 0.000 75.926
> ...
> 105.523
> {noformat}
> After finish the Q1 query, GCT is at 105.523, which means young gc cost 30
> seconds. As well known, ParNew YGC is a kind of stop the world gc. The
> whole query need about 98 secs. So in this case, young gc waste nearly 1
> third of the query time.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)