[ https://issues.apache.org/jira/browse/FLUME-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287156#comment-13287156 ]

Will McQueen commented on FLUME-1248:
-------------------------------------

Hi Mingjie and Hari,

>>Another option is to pass a parameter to flume-ng which indicates to include 
>>hadoop/hbase classpath or not.

AFAIK, the hbase jars are needed only by the HBase sink, and the hadoop jars 
are needed only by FileChannel and HBaseEventSink. If we enable the passing of 
a param to flume-ng that indicates whether to include the hadoop/hbase 
classpath, then what about the case where a user starts the agent without any 
of those 3 components in the config file, but later decides to modify the 
config while Flume is running? In that case, I believe a 
ClassNotFoundException would be thrown when one of those 3 components is added 
to the config (after a dynamic refresh is triggered within 30 secs of the 
modification).

For this reason, my vote is currently to pick up the hbase and hadoop jars.

Also, I'm wondering about the proposed solution of grepping for the hbase (or 
hadoop) string... it seems the 'hbase' string won't necessarily appear in the 
classpath if a user installs hbase from a tarball. So I'm concerned about 
having our parsing code tightly coupled to the 'hbase' string. Can we explore 
other alternatives? I think some things to consider in general would be:

1) Keep in mind that the goal of these 2 particular invocations of the hbase 
and hadoop scripts is only to get java.library.path, which I believe should be 
unaffected by any JVM options. I.e., I don't see a need to profile these 2 
invocations of hadoop and hbase the way I would if I were profiling the actual 
hbase and hadoop daemons. Do you?

2) There are several ways that JVM options might sneak into the environment 
when hbase/hadoop is run. One way might be through a file in /etc/default that 
gets sourced by a hadoop/hbase init script. Another could be through 
flume-env.sh. Another might be simply exporting a var from the parent shell 
that runs flume-ng. We should consider all these cases (and possibly more), 
and see whether it makes sense to override all JVM options passed to 
hadoop/hbase during these 2 invocations, resetting them to the empty string to 
guarantee clean output.
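A minimal sketch of the reset idea in point #2, assuming the hbase/hadoop launchers honor HBASE_OPTS/HADOOP_OPTS and the JVM honors JAVA_TOOL_OPTIONS/_JAVA_OPTIONS (the `clean_probe` helper name is made up for illustration):

```shell
# Hypothetical helper: run a launcher with the common JVM-option variables
# cleared, so stray options like -verbose:gc cannot pollute stdout.
# The variable names below are the usual suspects, not an exhaustive list.
clean_probe() {
  env HBASE_OPTS="" HADOOP_OPTS="" JAVA_TOOL_OPTIONS="" _JAVA_OPTIONS="" "$@"
}

# Usage (hypothetical):
#   clean_probe "$HBASE_IN_PATH" org.apache.flume.tools.GetJavaProperty java.library.path
clean_probe echo probe-ok
```

This only guards the two probe invocations; the real hbase/hadoop daemons would still see their normal options.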

3) We could modify the GetJavaProperty tool (or create an alternative impl and 
leave the current impl as it is). The reason is this: I'm concerned that the 
desired output we need from this tool might be interleaved with output 
produced by some JVM option, possibly resulting in corrupted output. So, one 
proposal would be to insert some checks into the output string that the 
flume-ng script could use to confirm the validity of the java.library.path 
value. For example, if GetJavaProperty returns a java.library.path value of 
".:/some/class/path", then the output to stdout could be something like 
"<magicnum>.:/some/class/path<magicnum><md5sum>", where magicnum is some known 
constant specified in GetJavaProperty that surrounds the java.library.path 
value, with a trailing md5sum of the ".:/some/class/path" value to ensure that 
the value was not interleaved with other output from some JVM option. The 
flume-ng script would then use the known magicnum to parse out the value 
(using something like a regex), parse out the md5sum trailer (which I believe 
is a fixed-length string), and compare it against the md5sum calculated by the 
flume-ng script itself. If they match, we continue. If they don't, then we 
either print a warning and continue, or we print an error and stop (I vote for 
error and stop).
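A rough sketch of the framing/checksum idea in option #3 (the `MAGIC` constant and all variable names are made up for illustration; this shows only the producer/consumer handshake, not actual GetJavaProperty changes, and assumes a system with `md5sum` available):

```shell
# Hypothetical framing: value is wrapped as <MAGIC>value<MAGIC><md5sum>.
MAGIC="FLUME1248"

# Producer side (what a modified GetJavaProperty might print to stdout):
value=".:/some/class/path"
sum=$(printf '%s' "$value" | md5sum | awk '{print $1}')
framed="${MAGIC}${value}${MAGIC}${sum}"

# Consumer side (what flume-ng would do): extract the value and verify it.
extracted="${framed#${MAGIC}}"      # strip the leading magic
extracted="${extracted%%${MAGIC}*}" # keep text up to the next magic
trailer="${framed##*${MAGIC}}"      # md5 trailer after the last magic
check=$(printf '%s' "$extracted" | md5sum | awk '{print $1}')

if [ "$check" = "$trailer" ]; then
  echo "OK: $extracted"
else
  echo "ERROR: corrupted GetJavaProperty output" >&2
  exit 1
fi
```

If interleaved GC output corrupted the framed string, the recomputed md5 would no longer match the trailer and the script would stop with an error, per the "error and stop" vote.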

Personally, I feel most comfortable with option #3 so far, as the most robust 
of the 3. Thoughts?

Cheers,
Will
                
> flume-ng script gets broken when it tried to load hbase classpath
> -----------------------------------------------------------------
>
>                 Key: FLUME-1248
>                 URL: https://issues.apache.org/jira/browse/FLUME-1248
>             Project: Flume
>          Issue Type: Bug
>          Components: Shell
>    Affects Versions: v1.1.0
>            Reporter: Mingjie Lai
>            Assignee: Hari Shreedharan
>             Fix For: v1.2.0
>
>
> bin/flume-ng tried to load hbase/hadoop class path by this:
> {code}
> 103     local HBASE_CLASSPATH=""
> 104     local HBASE_JAVA_LIBRARY_PATH=$(HBASE_CLASSPATH="$FLUME_CLASSPATH" \
> 105         ${HBASE_IN_PATH} org.apache.flume.tools.GetJavaProperty \
> 106         java.library.path 2>/dev/null)
> {code}
> It actually turned out to be:
> {code}
> $ hbase -cp ../lib/flume-ng-core-1.2.0-incubating-SNAPSHOT.jar \
>   org.apache.flume.tools.GetJavaProperty  java.library.path
> {code}
> However what I saw is:
> {code}
> -bash-3.2$ hbase -cp ../lib/flume-ng-core-1.2.0-incubating-SNAPSHOT.jar   
> org.apache.flume.tools.GetJavaProperty  java.library.path
> /usr/lib/hadoop-0.20/lib/native/Linux-amd64-64:/usr/lib/hbase/bin/../lib/native/Linux-amd64-64
> Heap
>  par new generation   total 235968K, used 8391K [0x00000002fae00000, 
> 0x000000030ae00000, 0x000000030ae00000)
>   eden space 209792K,   4% used [0x00000002fae00000, 0x00000002fb631f30, 
> 0x0000000307ae0000)
>   from space 26176K,   0% used [0x0000000307ae0000, 0x0000000307ae0000, 
> 0x0000000309470000)
>   to   space 26176K,   0% used [0x0000000309470000, 0x0000000309470000, 
> 0x000000030ae00000)
>  concurrent mark-sweep generation total 20709376K, used 0K 
> [0x000000030ae00000, 0x00000007fae00000, 0x00000007fae00000)
>  concurrent-mark-sweep perm gen total 21248K, used 2724K [0x00000007fae00000, 
> 0x00000007fc2c0000, 0x0000000800000000)
> {code}
> The hbase gc info goes to stdout and screws up the flume-ng script. 
> The root cause is the combination of several factors:
> 1. turn on hbase gc log by:
> {code}
> export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
>     -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
> {code}
> 2. the gc log directory's permissions are restricted to 755, and it is 
> owned by the hbase user.
> 3. another user, such as flume, executes the script.
> Since the flume user doesn't have write permission to the hbase gc log 
> directory, the JVM writes the gc info to stdout instead, and the flume 
> script gets screwed up. 
> A simple but tricky fix could be adding ``grep hbase'' in the script to 
> filter out the gc info:
> {code}
> 103     local HBASE_CLASSPATH=""
> 104     local HBASE_JAVA_LIBRARY_PATH=$(HBASE_CLASSPATH="$FLUME_CLASSPATH" \
> 105         ${HBASE_IN_PATH} org.apache.flume.tools.GetJavaProperty \
> 106         java.library.path | grep hbase 2>/dev/null)
> {code}
