Re: Drill: Memory Spilling for the Hash Aggregate Operator

2017-01-13 Thread Julian Hyde
The attachment didn’t come through. I’m hoping that you settled on a “hybrid” 
hash algorithm that can write to disk or to memory, so that the cost of 
discovering that the choice was wrong is not too great. With Goetz Graefe’s 
hybrid hash join (which can be easily adapted to hybrid hash aggregate), if the 
input ALMOST fits in memory you can process most of it in memory and then 
revisit the stuff you spilled to disk.
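
A minimal sketch of what a hybrid hash aggregate could look like, to make the idea concrete. This is not the design from the attachment and not Drill code: the class `HybridHashAggSketch`, the `Row` type, the `sumByKey` method, and the "budget counted in groups" are all hypothetical, and the spill files are simulated with in-memory lists. Rows are hashed into partitions; when the budget is exceeded, the largest in-memory partition is spilled as partial aggregates, and later rows for that partition go straight to its spill file. Partitions that never spill are finished in memory, and the spilled ones are revisited at the end.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HybridHashAggSketch {

  /** One input row: a grouping key and a value to sum (hypothetical shape). */
  public static final class Row {
    final String key;
    final long value;
    public Row(String key, long value) { this.key = key; this.value = value; }
  }

  /** SUM(value) GROUP BY key, keeping at most maxGroupsInMemory groups in memory. */
  public static Map<String, Long> sumByKey(List<Row> input, int numPartitions,
                                           int maxGroupsInMemory) {
    List<Map<String, Long>> tables = new ArrayList<>();  // in-memory hash table per partition
    List<List<Row>> spillFiles = new ArrayList<>();      // stand-in for on-disk spill files
    boolean[] spilled = new boolean[numPartitions];
    for (int i = 0; i < numPartitions; i++) {
      tables.add(new HashMap<>());
      spillFiles.add(new ArrayList<>());
    }

    int groupsInMemory = 0;
    for (Row row : input) {
      int p = Math.floorMod(row.key.hashCode(), numPartitions);
      if (spilled[p]) {
        spillFiles.get(p).add(row);  // partition already on "disk": defer the row
        continue;
      }
      Map<String, Long> table = tables.get(p);
      if (!table.containsKey(row.key)) {
        groupsInMemory++;
      }
      table.merge(row.key, row.value, Long::sum);

      if (groupsInMemory > maxGroupsInMemory) {
        // Over budget: spill the largest in-memory partition as partial aggregates.
        int victim = 0;
        for (int i = 1; i < numPartitions; i++) {
          if (tables.get(i).size() > tables.get(victim).size()) {
            victim = i;
          }
        }
        for (Map.Entry<String, Long> e : tables.get(victim).entrySet()) {
          spillFiles.get(victim).add(new Row(e.getKey(), e.getValue()));
        }
        groupsInMemory -= tables.get(victim).size();
        tables.get(victim).clear();
        spilled[victim] = true;
      }
    }

    // Partitions that stayed in memory are already final.
    Map<String, Long> result = new HashMap<>();
    for (Map<String, Long> table : tables) {
      result.putAll(table);
    }
    // Revisit the spilled partitions. A key always hashes to the same partition,
    // so spilled keys never collide with keys finished in memory. A real operator
    // would re-aggregate each spill file under the same memory budget, possibly
    // repartitioning recursively; this sketch simply merges the rows.
    for (List<Row> file : spillFiles) {
      for (Row r : file) {
        result.merge(r.key, r.value, Long::sum);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Row> input = new ArrayList<>();
    input.add(new Row("a", 1));
    input.add(new Row("b", 2));
    input.add(new Row("c", 3));
    input.add(new Row("a", 3));
    input.add(new Row("d", 5));
    // With a budget of 2 groups, at least one partition is spilled and revisited.
    System.out.println(sumByKey(input, 4, 2));
  }
}
```

If the input fits within the budget, nothing is spilled and the sketch behaves like a plain in-memory hash aggregate, which is the property that makes a wrong guess about input size cheap.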

> On Jan 13, 2017, at 7:46 PM, Boaz Ben-Zvi wrote:
> 
>  Hi Drill developers,
> 
>  Attached is a document describing the design for memory spilling 
> implementation for the Hash Aggregate operator.
> 
>  Please send me any comments or questions,
> 
> -- Boaz



[GitHub] drill pull request #721: DRILL-5172: Display elapsed time for queries in the...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/721


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #696: DRILL-4558: BSonReader should prepare buffer size a...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/696




[GitHub] drill pull request #695: DRILL-4868: fix how hive function set DrillBuf.

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/695




[GitHub] drill issue #695: DRILL-4868: fix how hive function set DrillBuf.

2017-01-13 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/695
  
+1




[GitHub] drill issue #696: DRILL-4558: BSonReader should prepare buffer size as actua...

2017-01-13 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/696
  
+1




[GitHub] drill pull request #687: DRILL-4996: Parquet Date auto-correction is not wor...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/687




[GitHub] drill pull request #714: DRILL-4919: Fix select count(1) / count(*) on csv w...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/714




[GitHub] drill pull request #708: DRILL-5152: Enhance the mock data source: better da...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/708




[GitHub] drill pull request #715: DRILL-5105: remove buffer size checking in getBuffe...

2017-01-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/715




[GitHub] drill issue #714: DRILL-4919: Fix select count(1) / count(*) on csv with hea...

2017-01-13 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/714
  
+1




[GitHub] drill issue #715: DRILL-5105: remove buffer size checking in getBuffers sinc...

2017-01-13 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/715
  
+1




[GitHub] drill issue #687: DRILL-4996: Parquet Date auto-correction is not working in...

2017-01-13 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/687
  
+1




[jira] [Created] (DRILL-5196) Could not run a single MongoDB unit test case through command line or IDE

2017-01-13 Thread Chunhui Shi (JIRA)
Chunhui Shi created DRILL-5196:
--

         Summary: Could not run a single MongoDB unit test case through command line or IDE
             Key: DRILL-5196
             URL: https://issues.apache.org/jira/browse/DRILL-5196
         Project: Apache Drill
      Issue Type: Bug
        Reporter: Chunhui Shi
        Assignee: Chunhui Shi


A single MongoDB unit test cannot be run from the IDE or the command line. 
When a single test case is run, the MongoDB instance is not started, so a 
'table not found' error is raised for 'mongo.employee.empinfo'.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-5195) Publish Operator and MajorFragment Stats in Profile page

2017-01-13 Thread Kunal Khatua (JIRA)
Kunal Khatua created DRILL-5195:
---

         Summary: Publish Operator and MajorFragment Stats in Profile page
             Key: DRILL-5195
             URL: https://issues.apache.org/jira/browse/DRILL-5195
         Project: Apache Drill
      Issue Type: Improvement
      Components: Web Server
Affects Versions: 1.9.0
        Reporter: Kunal Khatua
        Assignee: Kunal Khatua


Currently, we show runtimes for major fragments, and min, max, and avg times 
for setup, processing, and waiting for various operators.

It would be worthwhile to have additional stats for the following:

MajorFragment
  %Busy - Percentage of the active time of all the minor fragments within each 
major fragment during which they were busy.

Operator Profile
  %Busy - Percentage of the active time of all the fragments within each 
operator during which they were busy.
  Records - Total number of records propagated out by that operator.
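
One possible reading of %Busy, sketched below under stated assumptions: sum the busy time and the active time over all the (minor) fragments first, then take the ratio, instead of averaging per-fragment percentages. The class `BusyPercentSketch` and the method `percentBusy` are hypothetical names, and the issue does not say which timers count as "busy" or "active", so the sketch takes both as inputs.

```java
public class BusyPercentSketch {

  /**
   * Returns 100 * (total busy time) / (total active time) across fragments.
   * busyNanos[i] and activeNanos[i] belong to the same fragment. How "busy"
   * and "active" are measured is an assumption left to the caller.
   */
  public static double percentBusy(long[] busyNanos, long[] activeNanos) {
    if (busyNanos.length != activeNanos.length) {
      throw new IllegalArgumentException("arrays must have the same length");
    }
    long busy = 0;
    long active = 0;
    for (int i = 0; i < busyNanos.length; i++) {
      busy += busyNanos[i];
      active += activeNanos[i];
    }
    // Report 0% rather than dividing by zero when nothing was active.
    return active == 0 ? 0.0 : 100.0 * busy / active;
  }

  public static void main(String[] args) {
    // Two fragments: busy for 30 of 40 ns and for 10 of 40 ns.
    System.out.println(percentBusy(new long[] {30, 10}, new long[] {40, 40}));  // prints 50.0
  }
}
```

Summing before dividing weights long-running fragments more heavily than short ones, which seems to match "the active time for all the minor fragments".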





[GitHub] drill pull request #696: DRILL-4558: BSonReader should prepare buffer size a...

2017-01-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/696#discussion_r96048977
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/bson/TestBsonRecordReader.java ---
@@ -103,6 +103,17 @@ public void testStringType() throws IOException {
   }
 
   @Test
+  public void testSpecialCharStringType() throws IOException {
+    BsonDocument bsonDoc = new BsonDocument();
+    bsonDoc.append("stringKey", new BsonString("§§§§§§§§§1"));
--- End diff --

You might want to put your value in a string, convert it to a byte buffer, and 
assert that the byte buffer's length differs from the string's length. This 
will verify that the test is, in fact, exercising the problem case.
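
A rough sketch of that guard, assuming the value is encoded as UTF-8 (which is how BSON stores strings). The class `Utf8LengthCheck` and the helper `utf8Length` are hypothetical and are not part of the pull request. The test value is written with `\u00A7` escapes (the section sign used in the diff) so the snippet compiles regardless of source encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthCheck {

  /** Number of bytes the string occupies once encoded as UTF-8. */
  public static int utf8Length(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    // Nine section signs followed by '1', the same value as in the test.
    String value = "\u00A7\u00A7\u00A7\u00A7\u00A7\u00A7\u00A7\u00A7\u00A7" + "1";

    // Each section sign takes two bytes in UTF-8, so the two lengths differ.
    if (utf8Length(value) == value.length()) {
      throw new AssertionError("value does not exercise the multi-byte case");
    }
    System.out.println(value.length() + " chars, " + utf8Length(value) + " bytes");
  }
}
```

In the unit test itself the same check could be a single assertion placed before the reader is exercised.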




[GitHub] drill issue #714: DRILL-4919: Fix select count(1) / count(*) on csv with hea...

2017-01-13 Thread gparai
Github user gparai commented on the issue:

https://github.com/apache/drill/pull/714
  
+1 LGTM




[GitHub] drill issue #685: Drill 5043: Function that returns a unique id per session/...

2017-01-13 Thread nagarajanchinnasamy
Github user nagarajanchinnasamy commented on the issue:

https://github.com/apache/drill/pull/685
  
Sent a mail to the dev list with the findings. The issue is yet to be resolved:

[Link to the mail](http://mail-archives.apache.org/mod_mbox/drill-dev/201701.mbox/%3ccacd9g6hwrtfckd_2jhmrext_tb0zwcrxtb7j5digkmy9atk...@mail.gmail.com%3E)


