[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure

2017-10-02 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/914
  
Rebased on to master with conflicts resolved.


---


[GitHub] drill pull request #969: DRILL-5802: Provide a sortable table for tables wit...

2017-10-02 Thread kkhatua
GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/969

DRILL-5802: Provide a sortable table for tables within a query profile

Using the DataTables jQuery library, we can now sort tables (fragments and 
operators) to group like values or sort on a column.
In addition, tooltips have been provided for these columns.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-5802-Alt

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/969.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #969


commit fb39faa8736c790989c8bc2497cf608563e8096f
Author: Kunal Khatua 
Date:   2017-10-02T22:59:01Z

DRILL-5802: Provide a sortable table for tables within a query profile

Using the DataTables jQuery library, we can now sort tables (fragments and 
operators) to group like values or sort on a column.
In addition, tooltips have been provided for these columns.




---


Re: Newbie: Help debugging Drill

2017-10-02 Thread Paul Rogers
Hi Matthew,

Debugging Drill is a bit of an art.

Your direction is fine. However, the source code you forked from Apache Drill 
has evolved significantly since the 1.11 build, so you’ll want to build your 
own Drill if you want to do remote debugging.

> mvn clean install -DskipTests

Will get you started.

You added remote debug options to the sqlline (client) script. This works only 
if you are running Drill embedded in sqlline, which is one of the two modes in 
which sqlline can run. Be sure you are, in fact, running Drill embedded in this case.

You can also launch Drill as a server (change the DRILLBIT_JAVA_OPTS in 
drill-env.sh to include remote debugging). Then, attach the debugger, and 
lastly fire up sqlline to connect remotely.
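
To confirm which JVM actually has the debug agent loaded (the sqlline client JVM in embedded mode, or the Drillbit in server mode), one option is to inspect that JVM's startup arguments. The following is a minimal sketch only; the class name `JdwpCheck` is made up for illustration and is not part of Drill.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper for illustration; not part of Drill. */
class JdwpCheck {

  /** Returns true if any JVM argument loads the JDWP debug agent. */
  static boolean hasJdwpAgent(List<String> jvmArgs) {
    for (String arg : jvmArgs) {
      // "-agentlib:jdwp" is the current spelling; "-Xrunjdwp" is the legacy one.
      if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Startup arguments of the JVM this code is running in.
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    System.out.println("JDWP agent loaded: " + hasJdwpAgent(jvmArgs));
  }
}
```

Note that `main` only reports on the JVM it runs in, so the check is meaningful when invoked from code running inside the process you intend to debug.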

Note that Drill is a highly multi-threaded server. You really need to be 
familiar with code structure to know exactly where to look. The CSV area is a 
bit complex, with many areas interacting, so it will take some sleuthing to 
track down the problem.

It is also possible to debug Drill directly from Eclipse, running embedded. See 
the history of this list for some hints if you want to go down that path.

Thanks,

- Paul

> On Oct 2, 2017, at 2:19 PM, Matthew Mucker  wrote:
> 
> I became a new Drill user last week only to discover that Drill would crash
> with an IndexOutOfBounds exception on one of my queries. Some searching and
> testing later, my best guess is that I'm hitting DRILL-5451.
> 
> 
> 
> Since this is currently a showstopper for me, and since I might learn
> something by doing so, I thought I'd give it a go and try to debug this
> problem and see if I might be able to contribute back.
> 
> 
> 
> I'm finding that when it comes to debugging, I really don't know what I'm
> doing, and could use some help. 
> 
> 
> 
> Preferably, help made up of small words.
> 
> 
> 
> I'm running Drill 1.11.0 on Windows 7. To start Drill in debug mode, my best
> guess was to edit line 182 of sqlline.bat to read:
> 
> 
> 
> set DRILL_SHELL_JAVA_OPTS=%DRILL_SHELL_JAVA_OPTS%
> -Dlog.path="%DRILL_LOG_DIR%\sqlline.log"
> -Dlog.query.path="%DRILL_LOG_DIR%\sqlline_queries.log" -Xdebug
> -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
> 
> 
> 
> When I run sqlline.bat -u "jdbc:drill:sz=local" now, I get a message
> "Listening for transport dt_socket at address: 8000" which, I expect, means
> that the runtime is waiting for the debugger to attach.
> 
> 
> 
> I downloaded the Drill source and built with Maven. Using Eclipse Neon, I
> imported the project. I created a debug configuration for a Remote Java
> Application for project drill-common, on machine localhost:8000.
> 
> 
> 
> When I start the debug session, sqlline finishes launching and I get a
> prompt at which I can enter a SQL command. Which suggests to me that the
> debugger is in fact attached.
> 
> 
> 
> Inside Eclipse, I set breakpoints on lines 114 and 122 of
> drill-java-exec/src/main/java/io.netty.buffer/DrillBuf.java.
> 
> 
> 
> However, when I repro my issue, I get the IndexOutOfBoundsException in
> sqlline, but there's no indication in Eclipse that the debugger has broken
> in, and I see no facilities to examine the stack trace or the local
> variables.
> 
> 
> 
> What do I do next?
> 



Re: Newbie: Help debugging Drill

2017-10-02 Thread Charles Givre
Hi Matthew,
Can you describe the data you are querying and the query you are trying to 
execute?
— C





Newbie: Help debugging Drill

2017-10-02 Thread Matthew Mucker
I became a new Drill user last week only to discover that Drill would crash
with an IndexOutOfBounds exception on one of my queries. Some searching and
testing later, my best guess is that I'm hitting DRILL-5451.

 

Since this is currently a showstopper for me, and since I might learn
something by doing so, I thought I'd give it a go and try to debug this
problem and see if I might be able to contribute back.

 

I'm finding that when it comes to debugging, I really don't know what I'm
doing, and could use some help. 

 

Preferably, help made up of small words.

 

I'm running Drill 1.11.0 on Windows 7. To start Drill in debug mode, my best
guess was to edit line 182 of sqlline.bat to read:

 

set DRILL_SHELL_JAVA_OPTS=%DRILL_SHELL_JAVA_OPTS%
-Dlog.path="%DRILL_LOG_DIR%\sqlline.log"
-Dlog.query.path="%DRILL_LOG_DIR%\sqlline_queries.log" -Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000

 

When I run sqlline.bat -u "jdbc:drill:sz=local" now, I get a message
"Listening for transport dt_socket at address: 8000" which, I expect, means
that the runtime is waiting for the debugger to attach.

 

I downloaded the Drill source and built with Maven. Using Eclipse Neon, I
imported the project. I created a debug configuration for a Remote Java
Application for project drill-common, on machine localhost:8000.

 

When I start the debug session, sqlline finishes launching and I get a
prompt at which I can enter a SQL command. Which suggests to me that the
debugger is in fact attached.

 

Inside Eclipse, I set breakpoints on lines 114 and 122 of
drill-java-exec/src/main/java/io.netty.buffer/DrillBuf.java.

 

However, when I repro my issue, I get the IndexOutOfBoundsException in
sqlline, but there's no indication in Eclipse that the debugger has broken
in, and I see no facilities to examine the stack trace or the local
variables.

 

What do I do next?



[GitHub] drill pull request #963: DRILL-5259: Allow listing a user-defined number of ...

2017-10-02 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/963#discussion_r142240395
  
--- Diff: exec/java-exec/src/main/resources/rest/profile/list.ftl ---
@@ -37,7 +37,42 @@
   No running queries.
 
   
-  Completed Queries
+  
+
+//Validate that the fetch number is valid
+function checkMaxFetch() {
+  var maxFetch = document.forms["profileFetch"]["max"].value;
+  console.log("maxFetch: " + maxFetch);
+  if (isNaN(maxFetch) || (maxFetch < 1) || (maxFetch > 10) ) {
+    alert("Invalid Entry: " + maxFetch + "\n" +
+          "Please enter a valid number of profiles to fetch (1 to 10) ");
+    return false;
+  }
+  return true;
+}
+
+
+  Completed Queries
+  
+Showing ${model.getFinishedQueries()?size} profiles. Max: 
+
--- End diff --

Done. 
1. The keyword is now 'Refresh'.
2. The default fetch is populated into the textbox. Any subsequent changes 
are retained, so the user does not have to re-enter a value to trigger a 
reload. Only click on Refresh. 



---


[jira] [Created] (DRILL-5832) Migrate OperatorFixture to use SystemOptionManager rather than mock

2017-10-02 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5832:
--

         Summary: Migrate OperatorFixture to use SystemOptionManager rather than mock
             Key: DRILL-5832
             URL: https://issues.apache.org/jira/browse/DRILL-5832
         Project: Apache Drill
      Issue Type: Improvement
Affects Versions: 1.12.0
        Reporter: Paul Rogers
        Assignee: Paul Rogers
         Fix For: 1.12.0


The {{OperatorFixture}} provides structure for testing individual operators and 
other "sub-operator" bits of code. To do that, the framework provides mock 
network-free and server-free versions of the fragment context and operator 
context.

As part of the mock, the {{OperatorFixture}} provides a mock version of the 
system option manager that provides a simple test-only implementation of an 
option set.

With the recent major changes to the system option manager, this mock 
implementation has drifted out of sync with the system option manager. Rather 
than upgrading the mock implementation, this ticket asks to use the system 
option manager directly -- but configured for no ZK or file persistence of 
options.

The key reason for this change is that the system option manager has 
implemented a sophisticated way to handle option defaults; it is better to 
leverage that than to provide a mock implementation.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #968: DRILL-5830: Resolve regressions to MapR DB from DRILL-5546

2017-10-02 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/968
  
@jinfengni, how unreliable is the project push-down rule? In general, it is 
hard to reason about software when parts are unreliable. Should every use of 
the projection push down rule implement a "just in case" fallback as was done 
for HBase? Should we add that to all storage plugins?

Or, can we fix the projection push-down rule to make it more reliable? It 
seems that one of your fixes did just that. An HBase query before your fix to 
the rule didn't expand the wildcard, but the same query, after your projection 
rule fix, does expand the wildcard.

What are the scenarios in which the rule does not work? Can we fix them in 
the planner? 


---


[GitHub] drill issue #968: DRILL-5830: Resolve regressions to MapR DB from DRILL-5546

2017-10-02 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/968
  
Rebased onto latest master.


---


[GitHub] drill pull request #954: DRILL-5803: Show the hostname for each minor fragme...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/954


---


[GitHub] drill pull request #967: DRILL-5564: Added finally block for stopWait() to a...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/967


---


[GitHub] drill pull request #966: DRILL-5824: Retain original memory limit for 1st ph...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/966


---


[GitHub] drill pull request #959: DRILL-5816: Hash function produces skewed results o...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/959


---


[GitHub] drill pull request #964: DRILL-5755 Reduced default number of batches kept i...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/964


---


[GitHub] drill pull request #965: DRILL-5811 reduced repeated log messages further.

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/965


---


[GitHub] drill pull request #961: DRILL-5792: CONVERT_FROM_JSON on an empty file thro...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/961


---


[GitHub] drill pull request #962: DRILL-5820: Add support for libpam4j Pam Authentica...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/962


---


[GitHub] drill pull request #946: DRILL-5799: native-client: Support alternative buil...

2017-10-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/946


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-02 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
@paul-rogers done


---


[GitHub] drill pull request #510: DRILL-4139: Add missing BIT support for Parquet par...

2017-10-02 Thread pmaciolek
Github user pmaciolek closed the pull request at:

https://github.com/apache/drill/pull/510


---


[GitHub] drill issue #510: DRILL-4139: Add missing BIT support for Parquet partition ...

2017-10-02 Thread vvysotskyi
Github user vvysotskyi commented on the issue:

https://github.com/apache/drill/pull/510
  
@pmaciolek, could you please close this pull request, since it was merged 
in 1ea191fa351b29847e2358f5777982d602cf5ec3


---


[GitHub] drill issue #968: DRILL-5830: Resolve regressions to MapR DB from DRILL-5546

2017-10-02 Thread jinfengni
Github user jinfengni commented on the issue:

https://github.com/apache/drill/pull/968
  
I'm not convinced that it's a good idea to back out the HBase-specific 
changes made in DRILL-5546.

You are right that project push-down in the planner should ideally do its job 
and push down the list of columns. However, until the planner can claim there 
are no issues at all (just like what happened prior to DRILL-5546), execution 
may still face a column list containing "*". That's why in HBaseGroupScan we have 
verifyColumnsAndConvertStar, just in case the planner's project push-down did 
not work the way we want. If, as you suggested, such conversion in 
HBaseGroupScan is redundant, why would it cause a regression (if the planner's 
rule works as expected)? Or is it your intention to still keep "*" in 
HBaseRecordReader? If so, I think we are going in the wrong direction. HBase 
has a unified schema at the table level.
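
As a rough illustration of the "just in case" fallback described above, here is a minimal sketch. It is not the actual verifyColumnsAndConvertStar code; the class and method names are hypothetical, and expanding "*" to the table's column families is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; not the actual HBaseGroupScan code. */
class StarFallbackSketch {

  /**
   * If the projected columns still contain "*", replace the list with the
   * table's column families (an assumption for this sketch); otherwise
   * return a copy of the projection unchanged.
   */
  static List<String> convertStar(List<String> projected, List<String> tableFamilies) {
    if (projected.contains("*")) {
      return new ArrayList<>(tableFamilies);
    }
    return new ArrayList<>(projected);
  }

  public static void main(String[] args) {
    List<String> families = Arrays.asList("cf1", "cf2");
    // Planner failed to expand the wildcard: the fallback expands it.
    System.out.println(convertStar(Arrays.asList("*"), families));   // [cf1, cf2]
    // Planner pushed an explicit column list: the fallback is a no-op.
    System.out.println(convertStar(Arrays.asList("cf1"), families)); // [cf1]
  }
}
```

If the planner rule always expands the wildcard, the second case is the only one that occurs and the fallback changes nothing.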

I agree with the analysis of the empty map {} vs {a:varbinary}. It's 
something we have to deal with. As a matter of fact, such scenarios do not 
have to come from an empty batch. They could happen with two regions with >0 rows.

For instance, region 1 has 10 rows, with cf1.c1 appearing in only the first 5 
rows, while region 2 has 20 rows, with cf1.c1 appearing in every row. For the 
following query:
select CF1 FROM table where some_condition_on_row_key;

If "some_condition_on_row_key" is pushed down to HBase and prunes the first 5 
rows of region 1, region 1 will return a batch with 5 rows but with cf1 as an 
empty map, while region 2 will return cf1 as the map {c1:varbinary}.
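
The scenario can be reproduced with a toy model. This is only a sketch: none of the types below are Drill or HBase classes, each row is modeled as the map of columns it holds under cf1, and the row-key filter is simulated by dropping the first 5 rows of region 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only; none of these types are Drill or HBase classes. */
class RegionSchemaSketch {

  /** Builds a region of rowCount rows where only the first rowsWithC1 rows carry cf1.c1. */
  static List<Map<String, byte[]>> region(int rowCount, int rowsWithC1) {
    List<Map<String, byte[]>> rows = new ArrayList<>();
    for (int i = 0; i < rowCount; i++) {
      Map<String, byte[]> cf1 = new TreeMap<>();
      if (i < rowsWithC1) {
        cf1.put("c1", new byte[] {1});
      }
      rows.add(cf1);
    }
    return rows;
  }

  /** Columns seen under cf1 across the rows of one batch, i.e. the map schema a reader would infer. */
  static Set<String> cf1Schema(List<Map<String, byte[]>> batch) {
    Set<String> columns = new TreeSet<>();
    for (Map<String, byte[]> cf1 : batch) {
      columns.addAll(cf1.keySet());
    }
    return columns;
  }

  public static void main(String[] args) {
    List<Map<String, byte[]>> region1 = region(10, 5);  // c1 in the first 5 rows only
    List<Map<String, byte[]>> region2 = region(20, 20); // c1 in every row

    // Simulate the pushed-down row-key filter pruning the first 5 rows of region 1.
    List<Map<String, byte[]>> region1Batch = region1.subList(5, 10);

    System.out.println("region 1 cf1 schema: " + cf1Schema(region1Batch)); // []
    System.out.println("region 2 cf1 schema: " + cf1Schema(region2));      // [c1]
  }
}
```

Both batches are non-empty, yet the two regions disagree on the schema of cf1, which is the {} vs {c1:varbinary} mismatch.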

In that sense, DRILL-5546 exposes such issues and forces us to have a 
solution to handle the empty map {} vs {a:varbinary}.



---