Re: Maven build failing on checkstyle

2015-09-09 Thread Edmon Begoli
And I am sorry to bug you with this, but to me this was a perfectly
formatted javadoc, and I was surprised to see the build fail on it:

/** Abstract class for StorePlugin implementations.
 * See StoragePlugin for description of the interface intent and its methods.
 */
public abstract class AbstractStoragePlugin implements StoragePlugin{
  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractStoragePlugin.class);

However, the first line had a trailing space at its end, and
checkstyle did not like it. I was using vim, not an IDE.
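
For anyone else hitting this without an IDE: the pattern reported in the checkstyle output ('\s+$') is a plain regular expression applied to each line. Below is a minimal sketch of the same check in plain Java, so offending lines can be found before running the build. The class and method names are made up for illustration and are not part of Drill or checkstyle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: not part of Drill or checkstyle.
final class TrailingWhitespaceCheck {
    // The pattern reported in the checkstyle output: whitespace at end of line.
    private static final Pattern TRAILING = Pattern.compile("\\s+$");

    /** Returns the 1-based numbers of the lines that end in whitespace. */
    static List<Integer> offendingLines(List<String> lines) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (TRAILING.matcher(lines.get(i)).find()) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "/** Abstract class for StorePlugin implementations. ", // trailing space
            " * See StoragePlugin for description.",
            " */");
        System.out.println(offendingLines(lines)); // prints [1]
    }
}
```

In vim, `:%s/\s\+$//e` strips trailing whitespace from the current buffer, which removes what this check would flag.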

I am switching to IDEA ...


On Tue, Sep 8, 2015 at 11:48 PM, Edmon Begoli  wrote:

> I am running build on my fork, and Maven build is failing on the
> checkstyle:
>
> excerpt ...
>
> [INFO] --- maven-checkstyle-plugin:2.12.1:check (checkstyle-validation) @
> drill-java-exec ---
>
> [INFO] Starting audit...
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:31:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:33:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java:118:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:30:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:35:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:44:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:45:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:71:
> Line matches the illegal pattern '\s+$'.
>
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:74:
> Line matches the illegal pattern '\s+$'.
>
> Audit done.
>
> It looks like the Javadoc checkstyle is failing. These are included in my pull request:
>
> https://github.com/apache/drill/pull/139
>
>
> Can someone please advise whether I should suppress these or fix the
> issue, and how?
>
> It is a properly structured javadoc. It starts with /** and ends with */.
>
> I am not sure what else is required, but I will be happy to fix it to make
> it pass the checkstyle.
>
>
>
>
>
>
>


Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears

2015-09-09 Thread Sean Hsuan-Yi Chu


> On Sept. 8, 2015, 3:47 a.m., Jacques Nadeau wrote:
> > exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java, line 
> > 1105
> > 
> >
> > Please move text tests to their own file. I believe we already have a 
> > TestTextReader.
> > 
> > Can you also please add another file with a different format to prove 
> > that this fix works for multiple delimiters?
> > 
> > Lastly, you should add a small comment to the condition where you've
> > added it.
> 
> Sean Hsuan-Yi Chu wrote:
> 1. Done
> 2. By "multiple", I am assuming you meant like: "a"\t\t"a"? (Please see 
> the diff)
> 3. Done
> 
> Jacques Nadeau wrote:
> I mean try with something other than a tab as the delimiter.  For 
> example, try it with a space delimiter.

Done


- Sean Hsuan-Yi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/#review97962
---


On Sept. 9, 2015, 12:09 a.m., Sean Hsuan-Yi Chu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37893/
> ---
> 
> (Updated Sept. 9, 2015, 12:09 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Mehant Baid.
> 
> 
> Bugs: DRILL-3718
> https://issues.apache.org/jira/browse/DRILL-3718
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> For TSV files, when the TextReader reads a double quote, it keeps scanning
> until it reaches the second (closing) double quote.
>
> However, even after reaching the closing double quote, the current reader
> keeps going in order to trim the space (i.e., ' ').
>
> In TSV, there is no need to trim '\t' (tab), which is used to separate fields.
> 
> 
> Diffs
> -
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java 3899509
>   exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java d15cff2
>   exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java e63e528
>   exec/java-exec/src/test/resources/store/text/WithQuote.tsv PRE-CREATION
>   exec/java-exec/src/test/resources/store/text/WithQuoteMultiDelimiters.tsv PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/37893/diff/
> 
> 
> Testing
> ---
> 
> All
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>



Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears

2015-09-09 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/
---

(Updated Sept. 9, 2015, 3:06 p.m.)


Review request for drill, Jacques Nadeau and Mehant Baid.


Changes
---

Addressed comments


Bugs: DRILL-3718
https://issues.apache.org/jira/browse/DRILL-3718


Repository: drill-git


Description
---

For TSV files, when the TextReader reads a double quote, it keeps scanning
until it reaches the second (closing) double quote.

However, even after reaching the closing double quote, the current reader
keeps going in order to trim the space (i.e., ' ').

In TSV, there is no need to trim '\t' (tab), which is used to separate fields.
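
To make the description concrete, here is a minimal sketch of the intended behavior after a closing quote. It is not the actual TextReader code; the class name, method name, and parameters are hypothetical. The idea is that whitespace after the closing quote may be skipped, but the field delimiter never is, even when the delimiter is itself a whitespace character such as a tab or a space.

```java
// Hypothetical sketch; not the Drill TextReader implementation.
final class QuotedFieldSketch {
    /**
     * Returns the index of the first character at or after {@code afterQuote}
     * that must not be skipped. Whitespace is skipped, but the delimiter is
     * never skipped, even if it is a whitespace character.
     */
    static int skipTrailingWhitespace(String line, int afterQuote, char delimiter) {
        int i = afterQuote;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == delimiter || !Character.isWhitespace(c)) {
                break; // delimiter or real data: stop trimming here
            }
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String tsv = "\"a\"\t\t\"a\""; // quoted field, empty field, quoted field
        // With tab as the delimiter nothing is skipped, so the empty field survives.
        System.out.println(skipTrailingWhitespace(tsv, 3, '\t')); // prints 3
        // If tab were treated as trimmable whitespace, both tabs would be consumed.
        System.out.println(skipTrailingWhitespace(tsv, 3, ','));  // prints 5
    }
}
```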


Diffs (updated)
-

  
  exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java 3899509
  exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java d15cff2
  exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java e63e528
  exec/java-exec/src/test/resources/store/text/WithQuote.tsv PRE-CREATION
  exec/java-exec/src/test/resources/store/text/WithQuoteMultiDelimiters.tsv PRE-CREATION

Diff: https://reviews.apache.org/r/37893/diff/


Testing
---

All


Thanks,

Sean Hsuan-Yi Chu



Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears

2015-09-09 Thread Jacques Nadeau

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/#review98199
---


Most of this looks good. Let me restate my additional request as I think I'm 
being unclear. 

You are currently testing tsv files (tab-delimited files). I want you to test
the same type of conditions with another type of delimiter, for example
space-delimited files (e.g. .ssv). This will verify that your solution works for
multiple delimiter types.
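
As a rough sketch of what I mean (this is not Drill code; the splitter below is a hypothetical stand-in for the reader), the same quoted-field condition is exercised once per delimiter:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the text reader, for illustration only.
final class DelimiterSketch {
    /** Splits one line into fields; a field may be wrapped in double quotes. */
    static List<String> split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;            // quotes wrap data, they are not data
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString());  // delimiter outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field has no trailing delimiter
        return fields;
    }

    public static void main(String[] args) {
        char[] delimiters = {'\t', ' ', '|'};
        for (char d : delimiters) {
            // Same condition for every delimiter: quoted field, empty field, quoted field.
            String line = "\"a\"" + d + d + "\"a\"";
            System.out.println(split(line, d)); // prints [a, , a] each time
        }
    }
}
```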

- Jacques Nadeau





Review/merge DRILL-3641?

2015-09-09 Thread Daniel Barclay

Could someone please review and merge the patch for DRILL-3641
that is in Pull Request #113?

(It has been pending for several weeks, and it's just a 
single-enumeration-class documentation patch.)

Thanks,
Daniel

--
Daniel Barclay
MapR Technologies



Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears

2015-09-09 Thread Sean Hsuan-Yi Chu


> On Sept. 9, 2015, 3:11 p.m., Jacques Nadeau wrote:
> > Most of this looks good. Let me restate my additional request as I think 
> > I'm being unclear. 
> > 
> > You are currently testing tsv files (tab-delimited files). I want you to
> > test the same type of conditions with another type of delimiter, for
> > example space-delimited files (e.g. .ssv). This will verify that your
> > solution works for multiple delimiter types.

Oh... I uploaded the previous version. 

Please check the latest one. (I also updated the pom file and
bootstrap-storage-plugins.json so that the unit tests recognize .ssv.)


- Sean Hsuan-Yi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/#review98199
---





Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears

2015-09-09 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/
---

(Updated Sept. 9, 2015, 4:33 p.m.)


Review request for drill, Jacques Nadeau and Mehant Baid.


Changes
---

Uploaded the latest patch


Bugs: DRILL-3718
https://issues.apache.org/jira/browse/DRILL-3718


Repository: drill-git


Description
---

For TSV files, when the TextReader reads a double quote, it keeps scanning
until it reaches the second (closing) double quote.

However, even after reaching the closing double quote, the current reader
keeps going in order to trim the space (i.e., ' ').

In TSV, there is no need to trim '\t' (tab), which is used to separate fields.


Diffs (updated)
-

  
  exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java 3899509
  exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java d15cff2
  exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java e63e528
  exec/java-exec/src/test/resources/bootstrap-storage-plugins.json 4a7a53f 
  exec/java-exec/src/test/resources/store/text/WithQuote.ssv PRE-CREATION 
  exec/java-exec/src/test/resources/store/text/WithQuote.tbl PRE-CREATION 
  exec/java-exec/src/test/resources/store/text/WithQuote.tsv PRE-CREATION 
  pom.xml 8d4b318 

Diff: https://reviews.apache.org/r/37893/diff/


Testing
---

All


Thanks,

Sean Hsuan-Yi Chu



Directory and file based partition pruning

2015-09-09 Thread Aman Sinha
Currently, partition pruning gets all file names in the table and then applies
the pruning. When the files are spread out over several directories and
there is a filter on dirN, this is inefficient in terms of both
elapsed time and memory usage. This has been seen in a few use cases
recently.

We should ideally perform the pruning in 2 steps:  first get the top-level
directory names only and apply the directory filter, then get the filenames
within that directory and apply remaining filters.
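
A minimal sketch of the two-step idea, under simplifying assumptions: directories and files are plain strings, and listing a directory is modeled as a function so it is visible that only surviving directories get listed. None of these names exist in Drill; they are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of two-step pruning; not Drill's planner code.
final class TwoStepPruningSketch {
    static List<String> prune(List<String> topLevelDirs,
                              Predicate<String> dirFilter,
                              Function<String, List<String>> listFiles,
                              Predicate<String> fileFilter) {
        List<String> selected = new ArrayList<>();
        for (String dir : topLevelDirs) {
            // Step 1: apply the directory filter using directory names only.
            if (!dirFilter.test(dir)) {
                continue;
            }
            // Step 2: list files only inside surviving directories,
            // then apply the remaining filters.
            for (String file : listFiles.apply(dir)) {
                if (fileFilter.test(file)) {
                    selected.add(dir + "/" + file);
                }
            }
        }
        return selected;
    }
}
```

With this shape, the cost of listing files is proportional to the directories that pass the dirN filter rather than to the whole table.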

I will create a JIRA for this enhancement but let me know your thoughts...

Aman


Re: Maven build failing on checkstyle

2015-09-09 Thread Ted Dunning
Checkstyle is clearly being too picky here.

The only problem with spaces at the end of a line is that some tools strip
them out automagically.  This leads to format changes that make reviews
(very slightly) more difficult.

I would be willing to fix the checkstyle profile to be less draconian if
you would be willing to file the JIRA.



On Wed, Sep 9, 2015 at 5:14 AM, Edmon Begoli  wrote:

> And I am sorry to bug you with this, but to me this was a perfectly
> formatted javadoc, and I was surprised to see the build fail on it:
>
> /** Abstract class for StorePlugin implementations.
>  * See StoragePlugin for description of the interface intent and its methods.
>  */
> public abstract class AbstractStoragePlugin implements StoragePlugin{
>   static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractStoragePlugin.class);
>
> However, the first line had a trailing space at its end, and
> checkstyle did not like it. I was using vim, not an IDE.
>
> I am switching to IDEA ...


Re: Maven build failing on checkstyle

2015-09-09 Thread Jim Scott
I would suggest not changing the checkstyle and instead having anyone who
wants to commit code back configure their IDE to strip spaces at the ends
of lines. Every major IDE has supported this for nearly 10 years.

On Wed, Sep 9, 2015 at 1:17 PM, Ted Dunning  wrote:

> Checkstyle is clearly being too picky here.
>
> The only problem with spaces at the end of a line is that some tools strip
> them out automagically.  This leads to format changes that make reviews
> (very slightly) more difficult.
>
> I would be willing to fix the checkstyle profile to be less draconian if
> you would be willing to file the JIRA.



-- 
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal 





[GitHub] drill pull request: DRILL-3347: VARCHAR ResultSet.getObject return...

2015-09-09 Thread adeneche
Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/144#issuecomment-139001311
  
+1 LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Maven build failing on checkstyle

2015-09-09 Thread Edmon Begoli
I will do whatever you suggest. I stripped the spaces, re-committed, and
merged the change into the pull request.

I do think that the style checker is being a bit draconian, but I can live
with it.
I was just surprised to see the build fail on such a small source-commenting
issue, which is hard to detect for those not using an IDE, as this vertically
formatted javadoc is auto-inserted by the tool.

However, I think having checkstyle like this does contribute to better
code quality.

I would also recommend having checkstyle look for something meaningful
to code quality, such as the presence of javadoc comments on public methods,
empty javadoc comments, etc.


I will be happy to file a JIRA, and you guys can decide whether it is worth
acting on or closing. I am here to help the project, not to complain :-)


On Wed, Sep 9, 2015 at 2:23 PM, Jim Scott  wrote:

> I would suggest not changing the checkstyle and instead having anyone who
> wants to commit code back configure their IDE to strip spaces at the ends
> of lines. Every major IDE has supported this for nearly 10 years.

[GitHub] drill pull request: Drill 3580

2015-09-09 Thread hsuanyi
GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/149

Drill 3580



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-3580

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #149


commit 0d8db2811dda6a555313a2fd4d9831df1c65833f
Author: Hsuan-Yi Chu 
Date:   2015-09-04T18:35:45Z

DRILL-3580: Bump calcite version to 1.4.0-drill-r1; Add test case

commit 99e2f5b6d2b418ebc5e5b047af155ae84067b8d2
Author: Hsuan-Yi Chu 
Date:   2015-08-21T05:33:11Z

DRILL-3412: Add ProjectWindowTransposeRule to push Project past Window

commit 87e89f9624ffa1e8d845cc3f84976b1f7df6dcc2
Author: Hsuan-Yi Chu 
Date:   2015-08-22T00:11:49Z

DRILL-3683: Add baseline and expected plan for TestWindowFunctions suite

commit 35089c3fab8d43a128fc9460d2065b43b7b735d6
Author: Hsuan-Yi Chu 
Date:   2015-09-04T18:35:45Z

DRILL-2190, DRILL-2313, DRILL-2318: Add test cases

commit 3f77842b102d2d96493df92a4563903dcdbae1cc
Author: Hsuan-Yi Chu 
Date:   2015-09-07T23:47:18Z

DRILL-3280, DRILL-3360, DRILL-3601, DRILL-3649: Add test cases






[GitHub] drill pull request: Drill 3580

2015-09-09 Thread hsuanyi
Github user hsuanyi closed the pull request at:

https://github.com/apache/drill/pull/149




[GitHub] drill pull request: Update Calcite and Add Test cases

2015-09-09 Thread hsuanyi
GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/150

Update Calcite and Add Test cases



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-TEST

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #150


commit cf2148052c35a205e5eb92d203d117c2741ee1af
Author: Hsuan-Yi Chu 
Date:   2015-09-08T17:42:51Z

DRILL-3580: Bump calcite version to 1.4.0-drill-test-r2

commit 8683c61acf5d8edb228da912a6779cdf8f098e4d
Author: Hsuan-Yi Chu 
Date:   2015-09-08T17:45:51Z

DRILL-3580: Add test case

commit 45e89506fea9a6ea20a7fcacb0f3f140afa0f0fb
Author: Hsuan-Yi Chu 
Date:   2015-08-21T05:33:11Z

DRILL-3412: Add ProjectWindowTransposeRule to push Project past Window

commit 0817309b72a296b00bf7d3bbca646843018f1f26
Author: Hsuan-Yi Chu 
Date:   2015-08-22T00:11:49Z

DRILL-3683: Add baseline and expected plan for TestWindowFunctions suite

commit 960f80c95198387a315bf04fb39ccfd6a6353180
Author: Hsuan-Yi Chu 
Date:   2015-09-04T18:35:45Z

DRILL-2190, DRILL-2313, DRILL-2318: Add test cases

commit a8b0d28053e4cad07ef1cd2ffa3cc1d5faef3d90
Author: Hsuan-Yi Chu 
Date:   2015-09-07T23:47:18Z

DRILL-3280, DRILL-3360, DRILL-3601, DRILL-3649: Add test cases






[jira] [Created] (DRILL-3753) better error message for dropping hbase/hive table

2015-09-09 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3753:
-

         Summary: better error message for dropping hbase/hive table
             Key: DRILL-3753
             URL: https://issues.apache.org/jira/browse/DRILL-3753
         Project: Apache Drill
      Issue Type: Bug
      Components: Functions - Drill
Affects Versions: 1.2.0
        Reporter: Chun Chang
        Assignee: Mehant Baid
        Priority: Minor


commit_id: 0686bc23e8fbbd14fd3bf852893449ef8552439d

Drill cannot drop hbase/hive tables yet, but the error message should be
improved:

{code}
[#2] Query failed:
org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Unable to 
create or drop tables/views. Schema [hbase] is immutable.
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


In list filter evaluation : room for improvement in run-time code generation.

2015-09-09 Thread Jinfeng Ni
A few weeks ago there was a message on the Drill user list reporting performance
issues caused by an IN-list filter [1]. The query has a filter of this form:

WHERE
   c0 IN (v_00, v_01, v_02, v_03, ... )
OR
   c1 IN (v_10, v_11, v_12, v_13, ... )
OR
   c2 IN ...
OR
   c3 IN ...
OR
   ...

The profile shows that most of the query time is spent on filter evaluation.
One workaround that we recommended was to rewrite the query so that the
planner would convert the IN list into a join operation. It turns out that
converting to a join did help performance, but not as much as we wanted.

The original query has parquet as the data source. Therefore, the ideal
solution is parquet filter pushdown, which DRILL-1950 would address.

On the other hand, I noticed that there seems to be room for improvement
in the run-time generated code. In particular, for " c0 in (v_00, v_01,
...)",
Drill will evaluate it as :
c0 = v_00  OR c0 = v_01 OR ...

Each reference to "c0" leads to a vector initialization and a holder
assignment in the generated code, so the common reference is evaluated
redundantly.
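
To make the redundancy concrete, here is a minimal sketch in plain Java. It is
not Drill's generated code: the CountingColumn class and both method names are
hypothetical stand-ins, and the counter only illustrates how many times the
column value is fetched per row in each variant.

```java
/** Hypothetical illustration only; this is not Drill's generated code. */
final class CommonReferenceSketch {

  /** Stand-in for a column vector that counts how often a value is fetched. */
  static final class CountingColumn {
    private final int[] values;
    int reads = 0;

    CountingColumn(int... values) {
      this.values = values;
    }

    int get(int row) {
      reads++;
      return values[row];
    }
  }

  /** c0 = v_00 OR c0 = v_01 OR ..., fetching c0 again for every comparison. */
  static boolean matchesNaive(CountingColumn c0, int row, int[] inList) {
    for (int v : inList) {
      if (c0.get(row) == v) {
        return true;
      }
    }
    return false;
  }

  /** Same predicate, but c0 is fetched once and the local copy is reused. */
  static boolean matchesHoisted(CountingColumn c0, int row, int[] inList) {
    final int value = c0.get(row);
    for (int v : inList) {
      if (value == v) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    final int[] inList = {10, 20, 30, 40};
    final CountingColumn naive = new CountingColumn(5, 40);
    final CountingColumn hoisted = new CountingColumn(5, 40);
    for (int row = 0; row < 2; row++) {
      System.out.println("row " + row
          + ": naive=" + matchesNaive(naive, row, inList)
          + " hoisted=" + matchesHoisted(hoisted, row, inList));
    }
    System.out.println("reads: naive=" + naive.reads + " hoisted=" + hoisted.reads);
  }
}
```

With this toy input both variants return the same answers, but the naive form
fetches the column 8 times and the hoisted form only twice.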

I put together a patch that avoids the redundant evaluation of the
common reference.  Using the lineitem table from TPC-H at scale factor 10, I saw
a surprisingly large improvement (run on a Mac with an embedded drillbit).

1) IN list uses integer type [2]
  master branch: 12.53 seconds
  patch on top of master branch: 7.073 seconds
That is about a 44% reduction in run time.

2) IN list uses binary type [3]
  master branch: 198.668 seconds
  patch on top of master branch: 20.37 seconds

Two thoughts:
1. Does code size affect Janino compiler optimization or JVM HotSpot
optimization? Otherwise, it seems hard to explain why removing the
redundant evaluation makes such a difference. That might imply
that the efficiency of run-time generated code degrades as
more expressions are added to the query (?)

2. For the IN-list filter, it might make sense to create a Drill UDF. The
UDF would build a heap-based hash table in setup, similar to
what the join approach does.
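
The idea in the second thought can be sketched roughly as follows. This is
plain Java and deliberately does not use Drill's UDF interfaces; the class name
and the setup/eval split are assumptions that only mirror the general shape of
"build the table once, probe it once per row".

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a hash-based IN-list check; not a Drill UDF. */
final class InListSetSketch {
  // On-heap hash table holding the IN-list values.
  private final Set<Integer> values = new HashSet<>();

  /** "Setup" phase: build the hash table once, before any row is evaluated. */
  InListSetSketch(int... inList) {
    for (int v : inList) {
      values.add(v);
    }
  }

  /** "Eval" phase: a single hash probe per row instead of a chain of ORs. */
  boolean eval(int columnValue) {
    return values.contains(columnValue);
  }

  public static void main(String[] args) {
    final InListSetSketch filter = new InListSetSketch(10, 20, 30, 40);
    final int[] column = {5, 20, 40, 41};
    for (int value : column) {
      System.out.println(value + " -> " + filter.eval(value));
    }
  }
}
```

The cost per row then no longer grows with the length of the IN list, which is
the same property the join rewrite relies on.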

I'm going to open a JIRA to submit the patch for review, as I feel
it will benefit not only the IN-list filter but also other expressions with
common column references.


[1]
https://mail-archives.apache.org/mod_mbox/drill-user/201508.mbox/%3CCAC-7oTym0Yzr2RmXhDPag6k41se-uTkWu0QC%3DMABb7s94DJ0BA%40mail.gmail.com%3E

[2] https://gist.github.com/jinfengni/7f6df9ed7d2c761fed33

[3]  https://gist.github.com/jinfengni/7460f6d250f0d9ed


[GitHub] drill pull request: Issue DRILL-3736 - fixed syntax of partition b...

2015-09-09 Thread kristinehahn
Github user kristinehahn commented on the pull request:

https://github.com/apache/drill/pull/142#issuecomment-139043772
  
Sorry I can't merge or close this pull request, ^ ("Only those with write 
access . . ."), but I fixed the problem. Please close this pull request. Thanks 
for flagging the mistake.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (DRILL-3736) Documentation for partition is misleading/wrong syntax

2015-09-09 Thread Kristine Hahn (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn resolved DRILL-3736.
--
Resolution: Fixed
  Assignee: Kristine Hahn  (was: Bridget Bevens)

> Documentation for partition is misleading/wrong syntax
> --
>
> Key: DRILL-3736
> URL: https://issues.apache.org/jira/browse/DRILL-3736
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
> Environment: Any
>Reporter: Edmon Begoli
>Assignee: Kristine Hahn
>Priority: Minor
>  Labels: documentation
> Fix For: 1.2.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Examples on the web site use appropriate syntax (PARTITION BY)
> but the syntax definition uses PARTITION_BY.
> https://drill.apache.org/docs/partition-by-clause/
> It should all be PARTITION BY.





[jira] [Created] (DRILL-3754) Remove redundancy in run-time generated code for common column references.

2015-09-09 Thread Jinfeng Ni (JIRA)
Jinfeng Ni created DRILL-3754:
-

 Summary: Remove redundancy in run-time generated code for common 
column references. 
 Key: DRILL-3754
 URL: https://issues.apache.org/jira/browse/DRILL-3754
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Codegen
Affects Versions: 1.1.0
Reporter: Jinfeng Ni
Assignee: Jinfeng Ni
 Fix For: 1.2.0


When an operator (Filter, Project) has an expression that refers to the same field
multiple times, Drill initializes a value vector and does a value holder
assignment for each field reference in the run-time generated code.  The
redundancy can slow down expression evaluation when the compiled code is
executed over a large number of incoming rows.

This was seen in a recent performance issue reported on the Drill user list,
where the query contained multiple IN-list filter conditions.

In this JIRA, we'll remove the redundancy for the common field reference, so
that only one initialization and assignment happens in the run-time generated
code.






[GitHub] drill pull request: DRILL-3598: use a factory to create the root a...

2015-09-09 Thread cwestin
Github user cwestin commented on the pull request:

https://github.com/apache/drill/pull/104#issuecomment-139064541
  
It looks like you've merged this -- is there any reason not to close the 
pull request?




[GitHub] drill pull request: DRILL-3598: use a factory to create the root a...

2015-09-09 Thread adeneche
Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/104#issuecomment-139065032
  
If we forget to add the "this closes #xxx" to the commit message then the 
only person who can close a PR is the original author. You should close it.




[GitHub] drill pull request: DRILL-3598: use a factory to create the root a...

2015-09-09 Thread cwestin
Github user cwestin closed the pull request at:

https://github.com/apache/drill/pull/104




[GitHub] drill pull request: DRILL-3598: use a factory to create the root a...

2015-09-09 Thread jaltekruse
Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/104#issuecomment-139066368
  
Sorry about that, I will be making sure I do this in the future.




[jira] [Created] (DRILL-3755) UserException: throw Validation type of UserException if ValidationException is thrown from Calcite

2015-09-09 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3755:


 Summary: UserException: throw Validation type of UserException if 
ValidationException is thrown from Calcite
 Key: DRILL-3755
 URL: https://issues.apache.org/jira/browse/DRILL-3755
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu


Currently, a Parse type of UserException is thrown in that situation.

Some unit tests will need to be modified as well.





[GitHub] drill pull request: DRILL-1942-concurrency-test: new smoke test fo...

2015-09-09 Thread cwestin
Github user cwestin commented on a diff in the pull request:

https://github.com/apache/drill/pull/105#discussion_r39109156
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java
 ---
@@ -0,0 +1,199 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import java.io.IOException;
+import java.util.Random;
+import java.util.Set;
+import java.util.concurrent.Semaphore;
+
+import org.apache.drill.QueryTestUtil;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.util.TestTools;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState;
+import org.apache.drill.exec.rpc.user.UserResultsListener;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TestRule;
+
+import com.google.common.collect.Sets;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+/*
+ * Note that the real interest here is that the drillbit doesn't become
+ * unstable from running a lot of queries concurrently -- it's not about
+ * any particular order of execution. We ignore the results.
+ */
+public class TestTpchDistributedConcurrent extends BaseTestQuery {
+  /*
+   * Longer timeout than usual.
+   *
+   * If the test does fail due to a timeout, see the comment in
+   * ChainingResultListener.queryCompleted() before assuming this
+   * needs to be adjusted.
+   */
+  @Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(12);
+
+  /*
+   * Valid test names taken from TestTpchDistributed. Fuller path prefixes are
+   * used so that tests may also be taken from other locations -- more variety
+   * is better as far as this test goes.
+   */
+  private final static String queryFile[] = {
+"queries/tpch/01.sql",
+"queries/tpch/03.sql",
+"queries/tpch/04.sql",
+"queries/tpch/05.sql",
+"queries/tpch/06.sql",
+"queries/tpch/07.sql",
+"queries/tpch/08.sql",
+"queries/tpch/09.sql",
+"queries/tpch/10.sql",
+"queries/tpch/11.sql",
+"queries/tpch/12.sql",
+"queries/tpch/13.sql",
+"queries/tpch/14.sql",
+// "queries/tpch/15.sql", this creates a view
+"queries/tpch/16.sql",
+"queries/tpch/18.sql",
+"queries/tpch/19_1.sql",
+"queries/tpch/20.sql",
+  };
+
+  private final static int TOTAL_QUERIES = 115;
+  private final static int CONCURRENT_QUERIES = 15;
+
+  private final static Random random = new Random(0xdeadbeef); // Use the same seed each time.
+  private final static String alterSession = "alter session set `planner.slice_target` = 10";
+
+  private int remainingQueries = TOTAL_QUERIES - CONCURRENT_QUERIES;
+  private final Semaphore completionSemaphore = new Semaphore(0);
+  private final Semaphore submissionSemaphore = new Semaphore(0);
+  private final Set listeners = Sets.newIdentityHashSet();
+
+  private void submitRandomQuery() {
+final String filename = queryFile[random.nextInt(queryFile.length)];
+final String query;
+try {
+  query = QueryTestUtil.normalizeQuery(getFile(filename)).replace(';', ' ');
+} catch(IOException e) {
+  throw new RuntimeException("Caught exception", e);
+}
+final UserResultsListener listener = new ChainingSilentListener(query);
+client.runQuery(UserBitShared.QueryType.SQL, query, listener);
+synchronized(listeners) {
+  listeners.add(listener);
+}
+  }
+
+  private class ChainingSilentListener extends SilentListener {
+private final String query;
+
+public ChainingSilentListener(final String query) {
+  this.query = query;
+}
+
+@Override
+public void queryCompl

[GitHub] drill pull request: DRILL-1942-concurrency-test: new smoke test fo...

2015-09-09 Thread jaltekruse
Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/105#discussion_r39111529
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java
 ---

[GitHub] drill pull request: Issue DRILL-3736 - fixed syntax of partition b...

2015-09-09 Thread ebegoli
Github user ebegoli closed the pull request at:

https://github.com/apache/drill/pull/142




[GitHub] drill pull request: Issue DRILL-3736 - fixed syntax of partition b...

2015-09-09 Thread ebegoli
Github user ebegoli commented on the pull request:

https://github.com/apache/drill/pull/142#issuecomment-139078307
  
Kristine fixed it. I am closing the pull request.




[jira] [Created] (DRILL-3756) Consider loosening up the Maven checkstyle audit

2015-09-09 Thread Edmon Begoli (JIRA)
Edmon Begoli created DRILL-3756:
---

 Summary: Consider loosening up the Maven checkstyle audit
 Key: DRILL-3756
 URL: https://issues.apache.org/jira/browse/DRILL-3756
 Project: Apache Drill
  Issue Type: Wish
  Components: Tools, Build & Test
Affects Versions: 1.1.0
 Environment: Maven build on any platform.
Reporter: Edmon Begoli
Assignee: Steven Phillips
Priority: Minor
 Fix For: Future


A trailing space at the end of a Javadoc line causes the Maven build to fail
the checkstyle audit.

[INFO] --- maven-checkstyle-plugin:2.12.1:check (checkstyle-validation) @ 
drill-java-exec ---

[INFO] Starting audit...

for example

/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:30:
 Line matches the illegal pattern '\s+$'.
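
For anyone not using an IDE, a small stand-alone check along these lines can
locate the offending lines before the build does. This is only a sketch: it
applies the same regular expression that appears in the audit message, but the
class and method names are made up for illustration and it is not part of
checkstyle or of the Drill build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper; mirrors the '\s+$' pattern from the audit output. */
final class TrailingWhitespaceSketch {
  private static final Pattern TRAILING = Pattern.compile("\\s+$");

  /** Returns the 1-based numbers of all lines that end in whitespace. */
  static List<Integer> offendingLines(List<String> lines) {
    final List<Integer> result = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      if (TRAILING.matcher(lines.get(i)).find()) {
        result.add(i + 1);
      }
    }
    return result;
  }

  /** Removes trailing whitespace from a single line. */
  static String strip(String line) {
    return TRAILING.matcher(line).replaceAll("");
  }

  public static void main(String[] args) {
    final List<String> lines = Arrays.asList(
        "/** Abstract class for StorePlugin implementations. ", // trailing space
        " * See StoragePlugin for description of the interface intent.",
        " */");
    System.out.println("offending lines: " + offendingLines(lines));
    System.out.println("[" + strip(lines.get(0)) + "]");
  }
}
```

Leading indentation is left alone; only whitespace that runs up to the end of
the line is reported or stripped.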





Re: Maven build failing on checkstyle

2015-09-09 Thread Edmon Begoli
Ted et al.,

I created an issue as a "wish" for possibly loosening the checkstyle rules.

https://issues.apache.org/jira/browse/DRILL-3756

On Wed, Sep 9, 2015 at 2:42 PM, Edmon Begoli  wrote:

> I will do whatever you suggest. I stripped the spaces, re-committed, and
> merged into the pull request.
>
> I do think that the style checker is being a bit draconian, but I can live
> with it.
> I was just surprised to see the build fail on such a small source commenting
> issue that is hard to detect for those not using an IDE, as this vertically
> formatted javadoc is auto-inserted by the tool.
>
> However - I think having checkstyle like this does contribute to better
> code quality.
>
> I would also recommend having checkstyle look for something
> meaningful to code quality such as the presence of javadoc comments on
> public methods, empty javadoc comments, etc.
>
>
> I will be happy to file JIRA, and you guys decide if it is worth acting on
> it or closing it. I am here to help the project, not to complain :-)
>
>
> On Wed, Sep 9, 2015 at 2:23 PM, Jim Scott  wrote:
>
>> I would suggest not changing the checkstyle and having anyone wanting to
>> commit code back to configure their IDE's to just strip spaces at the ends
>> of lines. Every major IDE has supported this for nearly 10 years.
>>
>> On Wed, Sep 9, 2015 at 1:17 PM, Ted Dunning 
>> wrote:
>>
>> > Checkstyle is clearly being too picky here.
>> >
>> > The only problem with spaces at the end of a line is that some tools
>> strip
>> > them out automagically.  This leads to format changes that make reviews
>> > (very slightly) more difficult.
>> >
>> > I would be willing to fix the checkstyle profile to be less draconian if
>> > you would be willing to file the JIRA.
>> >
>> >
>> >
>> > On Wed, Sep 9, 2015 at 5:14 AM, Edmon Begoli  wrote:
>> >
>> > > and I am sorry to bug you with this but to me, this was a perfectly
>> > > formatted javadoc and I was surprised to see build failing on it:
>> > >
>> > > /** Abstract class for StorePlugin implementations.
>> > >  * See StoragePlugin for description of the interface intent and its
>> > > methods.
>> > >  */
>> > > public abstract class AbstractStoragePlugin implements StoragePlugin{
>> > >   static final org.slf4j.Logger logger =
>> > > org.slf4j.LoggerFactory.getLogger(AbstractStoragePlugin.class);
>> > >
>> > > However, it had a space before the end of the first line, and
>> > > checkstyle did not like it. I was using vim, not IDE.
>> > >
>> > > I am switching to IDEA ...
>> > >
>> > >
>> > > On Tue, Sep 8, 2015 at 11:48 PM, Edmon Begoli 
>> wrote:
>> > >
>> > > > I am running build on my fork, and Maven build is failing on the
>> > > > checkstyle:
>> > > >
>> > > > excerpt ...
>> > > >
>> > > > [INFO] --- maven-checkstyle-plugin:2.12.1:check
>> > (checkstyle-validation) @
>> > > > drill-java-exec ---
>> > > >
>> > > > [INFO] Starting audit...
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:31:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:33:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java:118:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:30:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:35:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:44:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:45:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:71:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > >
>> > >
>> >
>> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:74:
>> > > > Line matches the illegal pattern '\s+$'.
>> > > >
>> > > > Audit done.
>> > > >
>> > > > It looks like Javadoc checkstyle is failing. These are included in
>> my
>> > > pull:
>> > > >
>> > > > https://github.com/apache/drill/pull/139
>> > > >
>> > > >
>> > > > Can someone please advise how do I and should I either suppress
>> these
>> > or
>> > > > fix the issue.
>> > > >
>> > > > It is a properly structured javadoc. Starts with /** and ends with
>> */.
>

[jira] [Created] (DRILL-3757) Link to IntelliJ IDEA settings jar on the contributors guidelines page is broken.

2015-09-09 Thread Edmon Begoli (JIRA)
Edmon Begoli created DRILL-3757:
---

 Summary: Link to IntelliJ IDEA settings jar on the contributors 
guidelines page is broken.
 Key: DRILL-3757
 URL: https://issues.apache.org/jira/browse/DRILL-3757
 Project: Apache Drill
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.0
Reporter: Edmon Begoli
Assignee: Bridget Bevens
Priority: Minor
 Fix For: 1.2.0


The link
https://cwiki.apache.org/confluence/download/attachments/30757399/idea-settings.jar?version=1&modificationDate=1363022308000&api=v

on the Apache Drill Contribution Guidelines page points to a missing
settings jar:
https://drill.apache.org/docs/apache-drill-contribution-guidelines/





Re: In list filter evaluation : room for improvement in run-time code generation.

2015-09-09 Thread Aman Sinha
Yes, this would be a good enhancement.  Any improvement to the
efficiency/compactness of the generated code is complementary to other
optimizations such as parquet filter pushdown.  I recall that there was a
JIRA a while ago with hundreds or thousands of filter conditions creating
really bloated generated code - we should revisit that at some point to
identify scope for improvement.
I am not so sure about the UDF suggestion in #2.  It seems like
identifying why the large IN-list join approach was slow and fixing that
would be a more general solution.

Aman

On Wed, Sep 9, 2015 at 1:31 PM, Jinfeng Ni  wrote:

> Weeks ago there was a message on drill user list, reporting performance
> issues caused by in list filter [1].  The query has filter:
>
> WHERE
>c0 IN (v_00, v_01, v_02, v_03, ... )
> OR
>c1 IN (v_11, v_11, v_12, v_13, )
> OR
>c2 IN ...
> OR
>c3 IN ...
> OR
>
>
> The profile shows that most of query time is spent on filter evaluation.
> One workaround that we recommend was to re-write the query so that the
> planner would convert in list into join operation. Turns out that
> converting
> into join did help improve performance, but not as much as we wanted.
>
> The original query has parquet as the data source. Therefore, the ideal
> solution is parquet filter pushdown, which DRILL-1950 would address.
>
> On the other hand, I noticed that there seems to be room for improvement
> in the run-time generated code. In particular, for " c0 in (v_00, v_01,
> ...)",
> Drill will evaluate it as :
> c0 = v_00  OR c0 = v_01 OR ...
>
> Each reference of "c0" will lead to initialization of vector and holder
> assignment in the generated code. There is redundant evaluation for
> the common reference.
>
> I put together a patch,which will avoid the redundant evaluation for the
> common reference.  Using TPCH scale factor 10's lineitem table, I saw
> quite surprising improvement. (run on Mac with embedded drillbit)
>
> 1) In List uses integer type [2]
>   master branch :  12.53 seconds
>   patch on top of master branch : 7.073 seconds
> That's almost 45% improvement.
>
> 2) In List uses binary type [3]
>   master branch :  198.668 seconds
> patch on top of master branch: 20.37 seconds
>
> Two thoughts:
> 1. Will code size impact Janino compiler optimization or jvm hotspot
> optimization? Otherwise, it seems hard to explain the performance
> difference of removing the redundant evaluation. That might imply
> that the efficiency of run-time generated code may degrade with
> more expressions in the query (?)
>
> 2. For In-List filter, it might make sense to create a Drill UDF. The
> UDF will build a heap-based hashtable in setup, in a similar way
> as what the join approach will do.
>
>  I'm going to open a JIRA to submit the patch for review, as I feel
> it will benefit not only the in list filter, but also expressions with
> common column references.
>
>
> [1]
>
> https://mail-archives.apache.org/mod_mbox/drill-user/201508.mbox/%3CCAC-7oTym0Yzr2RmXhDPag6k41se-uTkWu0QC%3DMABb7s94DJ0BA%40mail.gmail.com%3E
>
> [2] https://gist.github.com/jinfengni/7f6df9ed7d2c761fed33
>
> [3]  https://gist.github.com/jinfengni/7460f6d250f0d9ed
>


IDEA settings jar file is missing on the website - does anyone have one to share?

2015-09-09 Thread Edmon Begoli
Colleagues,

Does anyone have a general IntelliJ IDEA settings jar file to share?

The one linked from the contribution guidelines page is broken:
https://drill.apache.org/docs/apache-drill-contribution-guidelines/

I entered an issue in JIRA:
https://issues.apache.org/jira/browse/DRILL-3757

Thank you,
Edmon


[GitHub] drill pull request: DRILL-3746: Get Hive partition values from Met...

2015-09-09 Thread vkorukanti
GitHub user vkorukanti opened a pull request:

https://github.com/apache/drill/pull/151

DRILL-3746: Get Hive partition values from MetaStore instead of from
parsing the partition location path

1) Added a partition with custom location to test Hive table. Existing 
partition tests now work after the fix.
2) Enabled a test which was disabled previously due to a bug in interpreter 
code

@amansinha100 Could you review the patch?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vkorukanti/drill DRILL-3764

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #151


commit 90169e8ce7f46047631b25f5d248ffdf35788cbf
Author: vkorukanti 
Date:   2015-09-10T00:42:45Z

DRILL-3746: Get Hive partition values from MetaStore instead of from 
parsing the partition location path

1) Added a partition with custom location to test Hive table. Existing 
partition tests now work after the fix.
2) Enabled a test which was disabled previously due to a bug in interpreter 
code






[GitHub] drill pull request: DRILL-1942-hygiene

2015-09-09 Thread jinfengni
Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/120#discussion_r39114713
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/BitVector.java ---
@@ -211,8 +220,11 @@ public TransferPair makeTransferPair(ValueVector to) {
 
   public void transferTo(BitVector target) {
 target.clear();
+if (target.data != null) {
+  target.data.release();
--- End diff --

release(1) ?

Also, do you intend to change from retain() to retain(1) across the source
code? Some places still use retain(). A quick search finds a
couple of them, including BitVector.load(), VariableLengthVectors.java:load(),
etc.

 




[GitHub] drill pull request: DRILL-1942-hygiene

2015-09-09 Thread jinfengni
Github user jinfengni commented on the pull request:

https://github.com/apache/drill/pull/120#issuecomment-139083251
  
+1

LGTM.






[GitHub] drill pull request: DRILL-3732: Drill leaks memory if external sor...

2015-09-09 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/147#discussion_r39115553
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -515,13 +517,23 @@ public BatchGroup mergeAndSpill(LinkedList batchGroups) throws Schem
 outputContainer.setRecordCount(count);
 newGroup.addBatch(outputContainer);
   }
+  injector.injectChecked(context.getExecutionControls(), 
INTERRUPTION_WHILE_SPILLING, IOException.class);
   newGroup.closeOutputStream();
+  success = true;
+} catch (IOException e) {
+  throw UserException.resourceError(e)
+.message("External Sort encountered an error while spilling to 
disk")
+.build(logger);
+} finally {
+  // make sure we properly release merged batch groups
   for (BatchGroup group : batchGroupList) {
-group.cleanup();
+AutoCloseables.close(group, logger);
--- End diff --

If there is an exception during this close operation, it would get ignored 
... 
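
One way to address the concern above is to close every group but still surface failures instead of only logging them. The following is a minimal sketch, assuming a hypothetical helper named CloseAll.closeAll; it is not Drill's AutoCloseables API and not the code in this pull request. It keeps the first exception and attaches later ones through Throwable.addSuppressed.

```java
import java.util.List;

// Hypothetical helper for illustration only; not Drill's AutoCloseables class.
final class CloseAll {
  private CloseAll() {}

  // Closes every resource, even if an earlier close() fails.
  // The first failure is rethrown; later failures are attached as suppressed.
  static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
    Exception first = null;
    for (AutoCloseable resource : resources) {
      try {
        resource.close();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

With this shape, the caller in the finally block decides whether a close failure should fail the query or only be logged, and no resource is skipped because an earlier one threw.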




[GitHub] drill pull request: Update Calcite and Add Test cases

2015-09-09 Thread hsuanyi
Github user hsuanyi closed the pull request at:

https://github.com/apache/drill/pull/150




[GitHub] drill pull request: Update Calcite and Add Test cases

2015-09-09 Thread hsuanyi
GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/152

Update Calcite and Add Test cases



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-TEST

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/152.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #152


commit a5e22bdd48bcf7624615363d7994a137e652ae15
Author: Hsuan-Yi Chu 
Date:   2015-09-08T17:42:51Z

DRILL-3580: Bump calcite version to 1.4.0-drill-test-r2

commit 12853282abc3d35ae0a7d2b6b630a3d9b9150761
Author: Hsuan-Yi Chu 
Date:   2015-09-08T17:45:51Z

DRILL-3580: Add test case

Fix is in CALCITE-841

commit 513805b193b786d548b4a876d9775fa848b55b8e
Author: Hsuan-Yi Chu 
Date:   2015-08-21T05:33:11Z

DRILL-3412: Add ProjectWindowTransposeRule to push Project past Window

Fix is in CALCITE-844

commit 3648d91233e09203389cd94b5270dbbddc02553f
Author: Hsuan-Yi Chu 
Date:   2015-08-22T00:11:49Z

DRILL-3683: Add baseline and expected plan for TestWindowFunctions suite

commit 2ce5af0debf690ed17f29ba5c0f9c3e91efbe299
Author: Hsuan-Yi Chu 
Date:   2015-09-04T18:35:45Z

DRILL-2190, DRILL-2313, DRILL-2318: Add test cases

Fixes are in CALCITE-634, CALCITE-613, CALCITE-662

commit 759a2853c0713162bcbe85d5e55ca2b86453786c
Author: Hsuan-Yi Chu 
Date:   2015-09-07T23:47:18Z

DRILL-3755: In DrillSqlWorker, give UserException.validationError if 
ValidationException is thrown from Calcite

commit 9e96545c65c6c2c1b8619207a7006ae2391651ac
Author: Hsuan-Yi Chu 
Date:   2015-09-09T23:21:54Z

DRILL-3280, DRILL-3360, DRILL-3601, DRILL-3649: Add test cases

Fix is in CALCITE-820






Re: In list filter evaluation : room for improvement in run-time code generation.

2015-09-09 Thread Jinfeng Ni
The reason the in-list join approach is not fast enough:
the query has 5 in-lists ORed together. Each in-list is converted
to a left outer join.  After the 5 left outer joins, there is a filter.

Since a left outer join does not prune any rows from the left side,
which is the base table in this case, each join essentially has
to scan the same number of rows as the base table and copy them
to the outgoing batch. That is, although the in-list evaluation
uses a hash-based probe, which is faster than the original
filter evaluation, the 5 left outer joins still incur a big overhead
in scanning and copying the data.

The UDF idea in #2 essentially does the same kind of hash-based
probe during filter evaluation. The hash table would be initialized as
a workspace variable in doSetup(). Then doEval() would
simply probe the hash table.  I feel it would achieve the same
benefit as the join approach, while avoiding the overhead of re-scanning
the data multiple times.

However, the current infrastructure seems to lack support
for VarArg in Drill's built-in functions or UDFs, which is required to
implement this idea.
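
To make the setup/eval split concrete, here is a minimal sketch in plain Java. It is not Drill's UDF interface or its generated code; the class InListFilter and its methods are hypothetical names chosen for illustration. The constructor plays the role of the setup step (build the hash table once), eval() plays the role of the per-row probe, and evalAsOrChain() mirrors the expanded form c0 = v_00 OR c0 = v_01 OR ...

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not Drill's UDF API or generated code.
final class InListFilter {
  private final Set<Integer> values;

  // "Setup" step: build the hash table once per filter.
  // The varargs parameter stands in for the VarArg support mentioned above.
  InListFilter(Integer... inList) {
    this.values = new HashSet<>(Arrays.asList(inList));
  }

  // "Eval" step: a single hash probe per row, independent of list length.
  boolean eval(int columnValue) {
    return values.contains(columnValue);
  }

  // Baseline: the expanded OR chain, one comparison per list entry.
  static boolean evalAsOrChain(int columnValue, int[] inList) {
    for (int v : inList) {
      if (columnValue == v) {
        return true;
      }
    }
    return false;
  }
}
```

Both methods return the same answer for any input; the difference is that the probe cost does not grow with the number of values in the list, and the table is scanned only once by the filter.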



On Wed, Sep 9, 2015 at 5:40 PM, Aman Sinha  wrote:

> Yes, this would be a good enhancement.  Any improvement to the
> efficiency/compactness of the generated code is complementary to other
> optimizations such as parquet filter pushdown.  I recall that there was a
> JIRA a while ago with hundreds or thousands of filter conditions creating
> really bloated generated code - we should revisit that at some point to
> identify scope for improvement.
> I am not so sure about the UDF suggestion in #2.   It seems like
> identifying why the large IN-list join approach was slow and fixing that
> would be a general solution.
>
> Aman
>
> On Wed, Sep 9, 2015 at 1:31 PM, Jinfeng Ni  wrote:
>
> > Weeks ago there was a message on drill user list, reporting performance
> > issues caused by in list filter [1].  The query has filter:
> >
> > WHERE
> >c0 IN (v_00, v_01, v_02, v_03, ... )
> > OR
> >c1 IN (v_10, v_11, v_12, v_13, ... )
> > OR
> >c2 IN ...
> > OR
> >c3 IN ...
> > OR
> >
> >
> > The profile shows that most of query time is spent on filter evaluation.
> > One workaround that we recommend was to re-write the query so that the
> > planner would convert in list into join operation. Turns out that
> > converting
> > into join did help improve performance, but not as much as we wanted.
> >
> > The original query has parquet as the data source. Therefore, the ideal
> > solution is parquet filter pushdown, which DRILL-1950 would address.
> >
> > On the other hand, I noticed that there seems to be room for improvement
> > in the run-time generated code. In particular, for " c0 in (v_00, v_01,
> > ...)",
> > Drill will evaluate it as :
> > c0 = v_00  OR c0 = v_01 OR ...
> >
> > Each reference of "c0" will lead to initialization of vector and holder
> > assignment in the generated code. There is redundant evaluation for
> > the common reference.
> >
> > I put together a patch, which will avoid the redundant evaluation for the
> > common reference.  Using TPCH scale factor 10's lineitem table, I saw
> > quite surprising improvement. (run on Mac with embedded drillbit)
> >
> > 1) In List uses integer type [2]
> >   master branch :  12.53 seconds
> >   patch on top of master branch : 7.073 seconds
> > That's almost 45% improvement.
> >
> > 2) In List uses binary type [3]
> >   master branch :  198.668 seconds
> > patch on top of master branch: 20.37 seconds
> >
> > Two thoughts:
> > 1. Will code size impact Janino compiler optimization or jvm hotspot
> > optimization? Otherwise, it seems hard to explain the performance
> > difference of removing the redundant evaluation. That might imply
> > that the efficiency of run-time generated code may degrade with
> > more expressions in the query (?)
> >
> > 2. For In-List filter, it might make sense to create a Drill UDF. The
> > UDF will build a heap-based hashtable in setup, in a similar way
> > as what the join approach will do.
> >
> >  I'm going to open a JIRA to submit the patch for review, as I feel
> > it will benefit not only the in list filter, but also expressions with
> > common column references.
> >
> >
> > [1]
> >
> >
> https://mail-archives.apache.org/mod_mbox/drill-user/201508.mbox/%3CCAC-7oTym0Yzr2RmXhDPag6k41se-uTkWu0QC%3DMABb7s94DJ0BA%40mail.gmail.com%3E
> >
> > [2] https://gist.github.com/jinfengni/7f6df9ed7d2c761fed33
> >
> > [3]  https://gist.github.com/jinfengni/7460f6d250f0d9ed
> >
>


Re: Maven build failing on checkstyle

2015-09-09 Thread Jacques Nadeau
Hey Ted,

FYI, we added the trailing-spaces check on purpose.  Please open a discussion
rather than making a random decision. If anything, our checkstyle is far too
lenient, which has led to poor consistency and missing comments.
On Sep 9, 2015 11:18 AM, "Ted Dunning"  wrote:

> Checkstyle is clearly being too picky here.
>
> The only problem with spaces at the end of a line is that some tools strip
> them out automagically.  This leads to format changes that make reviews
> (very slightly) more difficult.
>
> I would be willing to fix the checkstyle profile to be less draconian if
> you would be willing to file the JIRA.
>
>
>
> On Wed, Sep 9, 2015 at 5:14 AM, Edmon Begoli  wrote:
>
> > and I am sorry to bug you with this but to me, this was a perfectly
> > formatted javadoc and I was surprised to see build failing on it:
> >
> > /** Abstract class for StorePlugin implementations.
> >  * See StoragePlugin for description of the interface intent and its
> > methods.
> >  */
> > public abstract class AbstractStoragePlugin implements StoragePlugin{
> >   static final org.slf4j.Logger logger =
> > org.slf4j.LoggerFactory.getLogger(AbstractStoragePlugin.class);
> >
> > However, it had a space before the end of the first line, and
> > checkstyle did not like it. I was using vim, not IDE.
> >
> > I am switching to IDEA ...
> >
> >
> > On Tue, Sep 8, 2015 at 11:48 PM, Edmon Begoli  wrote:
> >
> > > I am running build on my fork, and Maven build is failing on the
> > > checkstyle:
> > >
> > > excerpt ...
> > >
> > > [INFO] --- maven-checkstyle-plugin:2.12.1:check
> (checkstyle-validation) @
> > > drill-java-exec ---
> > >
> > > [INFO] Starting audit...
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:31:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java:33:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java:118:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:30:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:35:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:44:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:45:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:71:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > >
> >
> /Users/ebegoli/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:74:
> > > Line matches the illegal pattern '\s+$'.
> > >
> > > Audit done.
> > >
> > > It looks like Javadoc checkstyle is failing. These are included in my
> > pull:
> > >
> > > https://github.com/apache/drill/pull/139
> > >
> > >
> > > Can someone please advise how do I and should I either suppress these
> or
> > > fix the issue.
> > >
> > > It is a properly structured javadoc. Starts with /** and ends with */.
> > >
> > > Not sure what else is required, but I will be happy to fix it to make it
> > pass
> > > the checkstyle.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
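
For anyone hitting the same audit failure, the rule quoted in the log flags any line that matches the regular expression '\s+$', that is, whitespace immediately before the end of the line. The sketch below reproduces that check outside of Maven so offending lines can be found before a build. The class and method names are hypothetical, and this is not how the checkstyle plugin itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-alone check mirroring the '\s+$' pattern from the audit log.
final class TrailingWhitespace {
  private static final Pattern TRAILING = Pattern.compile("\\s+$");

  private TrailingWhitespace() {}

  // Returns the 1-based numbers of lines that end in whitespace.
  static List<Integer> offendingLines(String source) {
    List<Integer> hits = new ArrayList<>();
    String[] lines = source.split("\n", -1);
    for (int i = 0; i < lines.length; i++) {
      if (TRAILING.matcher(lines[i]).find()) {
        hits.add(i + 1);
      }
    }
    return hits;
  }

  // Removes trailing whitespace from a single line.
  static String strip(String line) {
    return TRAILING.matcher(line).replaceAll("");
  }
}
```

Note that a line ending in a carriage return (CRLF files) also matches this pattern, since '\r' counts as whitespace. In vim, the substitution :%s/\s\+$//e removes trailing whitespace from the whole buffer.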


Drill - Error creating views

2015-09-09 Thread Sudip Mukherjee
Hi Devs,

I am getting the exception below while trying to create a view from the Web UI. 
Could you please advise on how to troubleshoot this?

org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 
java.lang.IndexOutOfBoundsException: index (43) must be less than size (43)


The query that I ran is below:

CREATE OR REPLACE VIEW dfs.myviews.downloadcenterdataview as SELECT 
`bollink`,`category`,`categoryname`,`categoryid`,`downloadtype`,`earlypreviewusers`,`installpreference`,`notificationcontent`,`notificationusers`,`notvisibleto`,`packagedescription`,`packageid`,`packageplatformmappingid`,`packagelocation`,`packagename`,`packagesize`,`packagestatus`,`platform`,`platformname`,`productversion`,`productversionname`,`readmelocation`,`recutnumber`,`subcategory`,`subcategoryid`,`subcategoryname`,`validfrom`,`validto`,`vendor`,`visibleto`,`bolcontent`,`readmecontent`,`content`,`createtime`,`modifiedtime`,`serverid`,`softwareicon`,`reportname`,`reportdescription`,`reportguid`,`reportrevision`,`reportformat`,`includeTable`,`includeChart`,`itemrank`,`priceweightage`
 from mydb.downloadcenterdata

Stack trace :

2015-09-10 01:09:55,109 [qtp727140336-1839] ERROR 
o.a.d.e.server.rest.QueryResources - Query from Web UI Failed
org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 
java.lang.IndexOutOfBoundsException: index (43) must be less than size (43)

at 
org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:111) 
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61) 
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233) 
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205) 
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
 ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]

Thanks,
Sudip




Re: Directory and file based partition pruning

2015-09-09 Thread Jacques Nadeau
Makes sense.

Is there a way we can do this with lazy materializations rather than writing
complex expression tree logic? I hate having all this custom expression
tree manipulation logic.

Also, it seems like this should be N phased rather than two phase where N
is the number of directories below the base path.

Thoughts?
On Sep 9, 2015 10:54 AM, "Aman Sinha"  wrote:

> Currently, partition pruning gets all file names in the table and applies
> the pruning.  Suppose the files are spread out over several directories and
> there is a filter  on dirN,  this is not efficient - both in terms of
> elapsed time and memory usage.  This has been seen in a few use cases
> recently.
>
> We should ideally perform the pruning in 2 steps:  first get the top-level
> directory names only and apply the directory filter, then get the filenames
> within that directory and apply remaining filters.
>
> I will create a JIRA for this enhancement but let me know your thoughts...
>
> Aman
>
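
The two-step pruning described in the quoted message can be sketched as follows. This is only an illustration under stated assumptions: the table layout is modeled as an in-memory map from directory name to file names rather than a real file system listing, and TwoStepPruning.prune is a hypothetical name, not part of Drill's planner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; the map stands in for a file system listing.
final class TwoStepPruning {
  private TwoStepPruning() {}

  static List<String> prune(Map<String, List<String>> tableLayout,
                            Predicate<String> dirFilter,
                            Predicate<String> fileFilter) {
    List<String> selected = new ArrayList<>();
    for (Map.Entry<String, List<String>> dir : tableLayout.entrySet()) {
      // Step 1: apply the directory filter before looking at any file names.
      if (!dirFilter.test(dir.getKey())) {
        continue;
      }
      // Step 2: only surviving directories have their files examined.
      for (String file : dir.getValue()) {
        if (fileFilter.test(file)) {
          selected.add(dir.getKey() + "/" + file);
        }
      }
    }
    return selected;
  }
}
```

The N-phase variant suggested in the reply would repeat step 1 once per directory level below the base path, descending only into directories that pass the filter for that level.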