[jira] [Commented] (DRILL-4573) Zero copy LIKE, REGEXP_MATCHES, SUBSTR

2016-05-30 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307230#comment-15307230
 ] 

Jinfeng Ni commented on DRILL-4573:
---

If relying on a configuration option to turn the additional check on or off 
means that the user has to set that option, this does not seem like a 
reasonable approach. A regular user may or may not know whether the data is 
ASCII-only; Drill should not force users to remember to set an option in order 
to get this 7% performance difference. 

How do you check whether the input is ASCII-only? I thought it could be done by 
simply checking that the number of chars equals the number of bytes. 
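
As an illustration of the check asked about above, here is a minimal sketch, 
assuming the VARCHAR bytes are valid UTF-8 and are available as a plain byte 
array. The class and method names (AsciiCheck, isAsciiOnly, 
charCountEqualsByteCount) are hypothetical and are not taken from the patch. 
In valid UTF-8 every non-ASCII character occupies more than one byte, so 
"chars == bytes" is equivalent to "no byte has its high bit set", and the 
latter can be tested without decoding or allocating a String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Drill or of the attached patches.
final class AsciiCheck {
    private AsciiCheck() {}

    /** True if every byte in [start, end) has its high bit clear (7-bit ASCII). */
    static boolean isAsciiOnly(byte[] utf8, int start, int end) {
        for (int i = start; i < end; i++) {
            if ((utf8[i] & 0x80) != 0) {
                return false; // lead or continuation byte of a multi-byte sequence
            }
        }
        return true;
    }

    /**
     * The "number of chars == number of bytes" formulation. It decodes into a
     * String, so it is shown for comparison only; it agrees with isAsciiOnly
     * only when the input is valid UTF-8.
     */
    static boolean charCountEqualsByteCount(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).length() == utf8.length;
    }

    public static void main(String[] args) {
        byte[] ascii = "drill-4573".getBytes(StandardCharsets.UTF_8);
        byte[] accented = "caf\u00e9".getBytes(StandardCharsets.UTF_8); // 4 chars, 5 bytes
        System.out.println(isAsciiOnly(ascii, 0, ascii.length));       // true
        System.out.println(isAsciiOnly(accented, 0, accented.length)); // false
    }
}
```

The byte scan costs one pass over the value, so whether it pays for itself 
depends on how much the ASCII-only fast path saves afterwards.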

For the LIKE option you mentioned, consider opening a separate JIRA to deliver 
the fix. For this one, let's focus on getting the incorrect-result issue fixed. 
We have to fix it in the next release, since an incorrect result is a critical 
bug. 


> Zero copy LIKE, REGEXP_MATCHES, SUBSTR
> --
>
> Key: DRILL-4573
> URL: https://issues.apache.org/jira/browse/DRILL-4573
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: jean-claude
> Priority: Critical
> Fix For: 1.7.0
>
> Attachments: DRILL-4573-3.patch.txt, DRILL-4573.patch.txt
>
>
> All the functions using java.util.regex.Matcher currently create 
> Java String objects to pass into matcher.reset().
> However, this creates an unnecessary copy of the bytes and a Java String object.
> The matcher accepts a CharSequence, so instead of making a copy we can create an 
> adapter from the DrillBuffer to the CharSequence interface.
> Gains of 25% in execution speed are possible when going over VARCHAR values of 36 
> chars. The gain will be proportional to the size of the VARCHAR.
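
The adapter idea in the description can be sketched as follows. This is an 
illustration under stated assumptions, not the code from the attached patches: 
java.nio.ByteBuffer stands in for Drill's buffer class, the class name 
AsciiBufferCharSequence is hypothetical, and the bytes are assumed to be 
ASCII-only so that one byte maps to exactly one char.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical adapter: exposes bytes [start, end) of a buffer as a CharSequence
// without copying them. Correct only for ASCII-only data (assumption).
final class AsciiBufferCharSequence implements CharSequence {
    private final ByteBuffer buffer; // stand-in for Drill's buffer type
    private final int start;
    private final int end;

    AsciiBufferCharSequence(ByteBuffer buffer, int start, int end) {
        if (start < 0 || end < start || end > buffer.limit()) {
            throw new IndexOutOfBoundsException("range " + start + ".." + end);
        }
        this.buffer = buffer;
        this.start = start;
        this.end = end;
    }

    @Override
    public int length() {
        return end - start;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Absolute get: does not move the buffer position.
        return (char) (buffer.get(start + index) & 0xFF);
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length() || from > to) {
            throw new IndexOutOfBoundsException("range " + from + ".." + to);
        }
        return new AsciiBufferCharSequence(buffer, start + from, start + to);
    }

    @Override
    public String toString() {
        // Copies; the matcher itself only needs length() and charAt().
        byte[] copy = new byte[length()];
        for (int i = 0; i < copy.length; i++) {
            copy[i] = buffer.get(start + i);
        }
        return new String(copy, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("xxdrill-4573yy".getBytes(StandardCharsets.US_ASCII));
        CharSequence view = new AsciiBufferCharSequence(buf, 2, 12);
        Matcher m = Pattern.compile("drill-\\d+").matcher("");
        System.out.println(m.reset(view).matches()); // true
    }
}
```

A single Matcher can be reused across rows by calling reset() with a new view 
each time. For data that is not ASCII-only, charAt() would have to decode 
multi-byte sequences, which is where the additional check discussed in the 
comments comes in.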



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4573) Zero copy LIKE, REGEXP_MATCHES, SUBSTR

2016-05-30 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-4573:
--
Priority: Critical  (was: Minor)



[jira] [Commented] (DRILL-4581) Various problems in the Drill startup scripts

2016-05-30 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306972#comment-15306972
 ] 

Paul Rogers commented on DRILL-4581:


Another issue found by John O., who was trying to enable GC logging.

drill-env.sh has the following:

export SERVER_GC_OPTS="-XX:+CMSClassUnloadingEnabled -XX:+UseG1GC "

Presumably, to enable logging, users add the following:

export SERVER_GC_OPTS="$SERVER_GC_OPTS -Xloggc:"

This is supposed to be replaced with the actual log file path in drillbit.sh:

export SERVER_GC_OPTS=${SERVER_GC_OPTS/"-Xloggc:"/"-Xloggc:${loggc}"}

The substitution here works only if SERVER_GC_OPTS contains ONLY the -Xloggc 
option, but fails when it also contains the other options.

> Various problems in the Drill startup scripts
> -
>
> Key: DRILL-4581
> URL: https://issues.apache.org/jira/browse/DRILL-4581
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Server
> Affects Versions: 1.6.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> Noticed the following in drillbit.sh:
> 1) Comment: DRILL_LOG_DIR  Where log files are stored. PWD by default.
> Code: DRILL_LOG_DIR=/var/log/drill or, if it does not exist, $DRILL_HOME/log
> 2) Comment: DRILL_PID_DIR  The pid files are stored. /tmp by default.
> Code: DRILL_PID_DIR=$DRILL_HOME
> 3) Redundant checking of JAVA_HOME. drillbit.sh sources drill-config.sh which 
> checks JAVA_HOME. Later, drillbit.sh checks it again. The second check is 
> both unnecessary and prints a less informative message than the 
> drill-config.sh check. Suggestion: Remove the JAVA_HOME check in drillbit.sh.
> 4) Though drill-config.sh carefully checks JAVA_HOME, it does not export the 
> JAVA_HOME variable. Perhaps this is why drillbit.sh repeats the check? 
> Recommended: export JAVA_HOME from drill-config.sh.
> 5) Both drillbit.sh and the sourced drill-config.sh check DRILL_LOG_DIR and 
> set the default value. Drill-config.sh defaults to /var/log/drill, or if that 
> fails, to $DRILL_HOME/log. Drillbit.sh just sets /var/log/drill and does not 
> handle the case where that directory is not writable. Suggested: remove the 
> check in drillbit.sh.
> 6) Drill-config.sh checks the writability of the DRILL_LOG_DIR by touching 
> sqlline.log, but does not delete that file, leaving a bogus, empty client log 
> file on the drillbit server. Recommendation: use bash commands instead.
> 7) The implementation of the above check is a bit awkward. It has a fallback 
> case with somewhat awkward logic. Clean this up.
> 8) drillbit.sh, but not drill-config.sh, attempts to create /var/log/drill if 
> it does not exist. Recommended: decide on a single choice, implement it in 
> drill-config.sh.
> 9) drill-config.sh checks if $DRILL_CONF_DIR is a directory. If not, defaults 
> it to $DRILL_HOME/conf. This can lead to subtle errors. If I use
> drillbit.sh --config /misspelled/path
> where I mistype the path, I won't get an error; I get the default config, 
> which may not at all be what I want to run. Recommendation: if the value of 
> DRILL_CONF_DIR is passed into the script (as a variable or via --config), 
> then that directory must exist. Else, use the default.
> 10) drill-config.sh exports, but may not set, HADOOP_HOME. This may be left 
> over from the original Hadoop script that the Drill script was based upon. 
> Recommendation: export only in the case that HADOOP_HOME is set for cygwin.
> 11) Drill-config.sh checks JAVA_HOME and prints a big, bold error message to 
> stderr if JAVA_HOME is not set. Then, it checks the Java version and prints a 
> different message (to stdout) if the version is wrong. Recommendation: use 
> the same format (and stderr) for both.
> 12) Similarly, other Java checks later in the script produce messages to 
> stdout, not stderr.
> 13) Drill-config.sh searches $JAVA_HOME to find java/java.exe and verifies 
> that it is executable. The script then throws away what we just found. Then, 
> drillbit.sh tries to recreate this information as:
> JAVA=$JAVA_HOME/bin/java
> This is wrong in two ways: 1) it ignores the actual java location and assumes 
> it, and 2) it does not handle the java.exe case that drill-config.sh 
> carefully worked out.
> Recommendation: export JAVA from drill-config.sh and remove the above line 
> from drillbit.sh.
> 14) drillbit.sh presumably takes extra arguments like this:
> drillbit.sh -Dvar0=value0 --config /my/conf/dir start -Dvar1=value1 
> -Dvar2=value2 -Dvar3=value3
> The -D bit allows the user to override config variables at the command line. 
> But, the scripts don't use the values.
> A) drill-config.sh consumes --config /my/conf/dir after consuming the leading 
> arguments:
> while [ $# -gt 1 ]; do
>   if [ "--config" = "$1" ]; then
> shift
> 

[jira] [Updated] (DRILL-4571) Add link to local Drill logs from the web UI

2016-05-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4571:

Attachment: drillbit_queries_json_screenshot.jpg

> Add link to local Drill logs from the web UI
> 
>
> Key: DRILL-4571
> URL: https://issues.apache.org/jira/browse/DRILL-4571
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.6.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Labels: doc-impacting
> Fix For: 1.7.0
>
> Attachments: display_log.JPG, drillbit_download.log.gz, 
> drillbit_queries_json_screenshot.jpg, drillbit_ui.log, log_list.JPG
>
>
> The web UI currently has a link to the query profile.
> It would be handy for users to have a link to the local logs as well.





[jira] [Commented] (DRILL-4571) Add link to local Drill logs from the web UI

2016-05-30 Thread Arina Ielchiieva (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306529#comment-15306529
 ] 

Arina Ielchiieva commented on DRILL-4571:
-

[~knguyen],
thanks for the verification.
I have fixed points 1 and 3.
Point 2, regarding how the content of drillbit_queries.json is displayed in 
Chrome, is not reproducible. 
Before my changes, the drillbit_queries.json log wrote all of its information 
to the file on a single line. I have modified logback.xml so that the query log 
is now written, and therefore displayed, in a more readable format. So if your 
drillbit_queries.json still contained earlier entries, it might have shown 
everything on one line. After my changes, all entries should be written to the 
file line by line.
I am attaching a screenshot of how drillbit_queries.json is displayed 
(drillbit_queries_json_screenshot.jpg). Could you please re-check?
All changes are in the "DRILL-4571-fix" branch in my repo 
(https://github.com/arina-ielchiieva/drill/commit/7def77b99d61a1c0e4810ba2a073bbc770d2d161).

> Add link to local Drill logs from the web UI
> 
>
> Key: DRILL-4571
> URL: https://issues.apache.org/jira/browse/DRILL-4571
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.6.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Labels: doc-impacting
> Fix For: 1.7.0
>
> Attachments: display_log.JPG, drillbit_download.log.gz, 
> drillbit_ui.log, log_list.JPG
>
>
> The web UI currently has a link to the query profile.
> It would be handy for users to have a link to the local logs as well.


