[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509212#comment-16509212
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

vrozov commented on issue #1244: DRILL-6373: Refactor Result Set Loader for 
Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396471425
 
 
   @paul-rogers I did not check the proposed solution and there may be bugs in it, 
but the main point is that when `PartitionSender` creates new vectors it should 
not mutate the `MaterializedField` of the `incoming` batch vectors. It should be 
possible to construct a new vector based on an existing vector, including maps 
nested two, three or more levels deep and other non-flat vectors. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-4364) Image Metadata Format Plugin

2018-06-11 Thread Akihiko Kusanagi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509183#comment-16509183
 ] 

Akihiko Kusanagi commented on DRILL-4364:
-

Thanks [~bbevens]. This is a minor issue, but could you please change the style 
of the line 'Retrieving the images larger than 640 x 480 pixels' in the Examples 
section to distinguish it from the code blocks?

> Image Metadata Format Plugin
> 
>
> Key: DRILL-4364
> URL: https://issues.apache.org/jira/browse/DRILL-4364
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Reporter: Akihiko Kusanagi
>Assignee: Akihiko Kusanagi
>Priority: Major
>  Labels: doc-complete, ready-to-commit
> Fix For: 1.14.0
>
>
> Support querying of metadata in various image formats. This plugin leverages 
> [metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. This 
> plugin is especially useful when querying a large number of image files 
> stored in a distributed file system without building a metadata repository in 
> advance.
> This plugin supports the following file formats.
>  * JPEG, TIFF, PSD, PNG, BMP, GIF, ICO, PCX, WAV, AVI, WebP, MOV, MP4, EPS
>  * Camera Raw: ARW (Sony), CRW/CR2 (Canon), NEF (Nikon), ORF (Olympus), RAF 
> (FujiFilm), RW2 (Panasonic), RWL (Leica), SRW (Samsung), X3F (Foveon)
> This plugin enables reading the following metadata.
>  * Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, PNG 
> properties, BMP properties, GIF properties, ICO properties, PCX properties, 
> WAV properties, AVI properties, WebP properties, QuickTime properties, MP4 
> properties, EPS properties
> Since each type of metadata has a different set of fields, the plugin returns 
> a set of commonly used fields, such as the image width, height and bits per 
> pixel, for ease of use.
> *Examples:*
> Querying on a JPEG file with the property descriptive: true
> {noformat}
> 0: jdbc:drill:zk=local> select FileName, * from 
> dfs.`4349313028_f69ffa0257_o.jpg`;
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | FileName | FileSize | FileDateTime | Format | PixelWidth | PixelHeight | 
> BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | HasAlpha | 
> Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | 
> AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | 
> GPS | ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 +08:00 
> 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 
> 00:00:00 | Unknown | 0 | Unknown | 0 | 0 | 
> {"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 
> pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y 
> component: Quantization table 0, Sampling factors 2 horiz/2 
> vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 
> horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling 
> factors 1 horiz/1 vert"} | 
> {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
> dots","YResolution":"96 
> dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | 
> {"Software":"Picasa 3.0"} | 
> {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
> {"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | 
> {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
> 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
> 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
> {"Compression":"JPEG (old-style)","XResolution":"72 dots per 
> inch","YResolution":"72 dots per 
> inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
> bytes","ThumbnailLength":"7213 bytes"} | {} | 
> {"Keywords":"135;2002;issaquah;police car;wa;washington"} | 
> {"NumberOfTables":"4 Huffman tables"} | 
> {"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic 
> Experts 
> Group","DetectedMIMEType":"image/jpeg","ExpectedFileNameExtension":"jpg"} |
> 

[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509171#comment-16509171
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396462788
 
 
   @vrozov, by the way thanks for taking the time to work on this -- a huge 
help since I can't run the functional tests myself.
   
   If all this seems overly complex, I agree that it is. I spent many hours 
working out all these twists and turns. The original design is broken and was 
never cleanly revised once maps were added.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509164#comment-16509164
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396462584
 
 
   @vrozov, your proposed solution works for the single-level map case. It does 
*not* work for the two-level map case as the `MaterializedField` used inside 
the inner map is not the one passed from the outer map. Hacky code would be 
needed to pass in the `MaterializedField`, then fetch it to add it to the outer 
map's `MaterializedField`.
   
   And, even the above hack does not work for three-level maps.
   
   So, what else can we do? We can note one additional problem with maps. If we 
simply take a map field `m` from batch B1 and use it to create a map in batch 
B2, we have a dilemma. The `m` `MaterializedField` already contains "child" 
entries for `a` and `b`, say.
   
   A similar issue exists for any vector with structure: VarChar (offsets and 
data), Nullable (bits and data), etc.
   
   So, what to do? The new code, of which this PR is a part, faced similar 
difficulties with the additional complexity of the `ResultSetLoader` which must 
clone a vector to produce the "overflow" vector. (This is the reason for fixing 
the Nullable vectors so that the data vector has its type changed to Required, 
so that the vector clone works correctly.)
   
   So, we get our solution. Clone the `MaterializedField` in the 
`PartitionSender` *before* calling `BasicTypeHelper.getNewVector`.
   
   We might think that `getNewVector()` could do the clone. But it is used by 
`ResultSetLoader` on the assumption that the new vector will use the 
`MaterializedField` provided. And, in the vast majority of cases, we don't need a clone.
   
   It is only when using one vector to create another that we need the clone, 
so it must be done in the caller. In this case, that is `PartitionSender`.
   
   Does this make sense? Maybe the easiest thing is to just make the change and 
do a test run.
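   
   A minimal sketch of that idea, under assumptions: `deepCopy` is a hypothetical 
helper written only for illustration (an equivalent copy method may already exist 
on `MaterializedField`), and the exact `PartitionSender` call site is not shown.
   
   ```java
   import org.apache.drill.exec.expr.BasicTypeHelper;
   import org.apache.drill.exec.memory.BufferAllocator;
   import org.apache.drill.exec.record.MaterializedField;
   import org.apache.drill.exec.vector.ValueVector;
   
   public class CloneBeforeCreate {
     // Hypothetical deep copy: duplicate a field, children included, so the copy
     // can be mutated freely while the outgoing vector is built up.
     static MaterializedField deepCopy(MaterializedField original) {
       MaterializedField copy = MaterializedField.create(original.getName(), original.getType());
       for (MaterializedField child : original.getChildren()) {
         copy.addChild(deepCopy(child));
       }
       return copy;
     }
   
     // Sketch of the call site: the incoming vector's metadata is never touched.
     static ValueVector newOutgoingVector(ValueVector incoming, BufferAllocator allocator) {
       return BasicTypeHelper.getNewVector(deepCopy(incoming.getField()), allocator);
     }
   }
   ```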


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509163#comment-16509163
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396462584
 
 
   @vrozov, your proposed solution works for the single-level map case. It does 
*not* work for the two-level map case as the `MaterializedField` used inside 
the inner map is not the one passed from the outer map. Hacky code would be 
needed to pass in the `MaterializedField`, then fetch it to add it to the outer 
map's `MaterializedField`.
   
   And, even the above hack does not work for three-level maps.
   
   So, what else can we do? We can note one additional problem with maps. If we 
simply take a map field `m` from batch B1 and use it to create a map in batch 
B2, we have a dilemma. The `m` `MaterializedField` already contains "child" 
entries for `a` and `b`, say.
   
   A similar issue exists for any vector with structure: VarChar (offsets and 
data), Nullable (bits and data), etc.
   
   So, what to do? The new code, of which this PR is a part, faced similar 
difficulties with the additional complexity of the `ResultSetLoader` which must 
clone a vector to produce the "overflow" vector. (This is the reason for fixing 
the Nullable vectors so that the data vector has its type changed to Required, 
so that the vector clone works correctly.)
   
   (If all this seems overly complex, I agree that it is. I spent many hours 
working out all these twists and turns. The original design is broken and was 
never cleanly revised once maps were added.)
   
   So, we get our solution. Clone the `MaterializedField` in the 
`PartitionSender` *before* calling `BasicTypeHelper.getNewVector`.
   
   We might think that `getNewVector()` could do the clone. But it is used by 
`ResultSetLoader` on the assumption that the new vector will use the 
`MaterializedField` provided. And, in the vast majority of cases, we don't need a clone.
   
   It is only when using one vector to create another that we need the clone, 
so it must be done in the caller. In this case, that is `PartitionSender`.
   
   Does this make sense? Maybe the easiest thing is to just make the change and 
do a test run.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509153#comment-16509153
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396460011
 
 
   @vrozov, this is what I meant by the code only working well for a flat 
vector. Maps have a `MaterializedField` for their own metadata. That metadata 
contains a list of the child metadata. So:
   
   ```
   m:Map, {a:Int, b:Varchar}
   . a:INT
   . b:Varchar
   ```
   
   This means that, as we add `a` and then `b` to map `m`, we must mutate the 
`MaterializedField` for `m` to add the child fields. This completely breaks the 
idea that a `MaterializedField` is always immutable. This is a place where a 
design goal (an immutable `MaterializedField`) collides with the implementation 
(readers add to maps as they discover new fields).
   
   So, we could make a clone on each modification and the problem is solved, right? As 
in many places in Drill, the solution is not so simple. The above is the easy 
case. What about a nested map:
   
   ```
   m1:Map, {m2:Map {a:Int, b:Varchar}}
   . m2:Map, {a:Int, b:Varchar}
   . . a:INT
   . . b:Varchar
   ```
   
   Now when we add `a`, we have to update both the `m1` and `m2` 
`MaterializedField`s. If we clone the `MaterializedField` for `m2`, then the old 
version held by `m1` will get out of sync. The result will be:
   
   ```
   m1:Map, {m2:Map {}}
   . m2:Map, {a:Int, b:Varchar}
   . . a:INT
   . . b:Varchar
   ```
   
   Code that depends on accurate schema information then breaks.
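   
   A small self-contained illustration of the problem (the `MaterializedField` 
calls are Drill's API as I understand it; the field names `m1`, `m2`, `a` are 
just examples, not taken from the PR):
   
   ```java
   import org.apache.drill.common.types.TypeProtos.MinorType;
   import org.apache.drill.common.types.Types;
   import org.apache.drill.exec.record.MaterializedField;
   
   public class MapMetadataExample {
     public static void main(String[] args) {
       MaterializedField m1 = MaterializedField.create("m1", Types.required(MinorType.MAP));
       MaterializedField m2 = MaterializedField.create("m2", Types.required(MinorType.MAP));
       m1.addChild(m2);   // m1's metadata now holds a reference to m2
   
       MaterializedField a = MaterializedField.create("a", Types.required(MinorType.INT));
       m2.addChild(a);    // mutating m2 in place keeps m1's view consistent
   
       // Had we cloned m2 before adding `a`, the copy held by m1 would still
       // describe an empty map -- the "m1:Map {m2:Map {}}" schema shown above.
       System.out.println(m1.getChildren());
     }
   }
   ```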


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509147#comment-16509147
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396460011
 
 
   @vrozov, this is what I meant by the code only working well for a flat 
vector. Maps have a `MaterializedField` for their own metadata. That metadata 
contains a list of the child metadata. So:
   
   ```
   m:Map, {a:Int, b:Varchar}
   . a:INT
   . b:Varchar
   ```
   
   This means that, as we add `a` and then `b` to map `m`, we must mutate the 
`MaterializedField` for `m` to add the child fields. This completely breaks the 
idea that a `MaterializedField` is always immutable. This is a place where a 
design goal (an immutable `MaterializedField`) collides with the implementation 
(readers add to maps as they discover new fields).
   
   So, we could make a clone on each modification and the problem is solved, right? As 
in many places in Drill, the solution is not so simple. The above is the easy 
case. What about a nested map:
   
   ```
   m1:Map, {m2:Map {a:Int, b:Varchar}}
   . m2:Map, {a:Int, b:Varchar}
   . . a:INT
   . . b:Varchar
   ```
   
   Now when we add `a`, we have to update both the `m1` and `m2` 
`MaterializedField`s. If we clone the `MaterializedField` for `m2`, then the old 
version held by `m1` will get out of sync. The result will be:
   
   ```
   m1:Map, {m2:Map {}}
   . m2:Map, {a:Int, b:Varchar}
   . . a:INT
   . . b:Varchar
   ```
   
   Code that depends on accurate schema information then breaks.
   
   So, we're left with a choice: clone and have a corrupt schema, or make 
`MaterializedField` mutable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509141#comment-16509141
 ] 

ASF GitHub Bot commented on DRILL-5735:
---

ilooner commented on issue #1279: DRILL-5735: Allow search/sort in the Options 
webUI
URL: https://github.com/apache/drill/pull/1279#issuecomment-396458427
 
 
   Since it looks like reviewers have requested additional work to be done on 
this change, I have removed the ready-to-commit label from the Jira.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> UI options grouping and filtering & Metrics hints
> -
>
> Key: DRILL-5735
> URL: https://issues.apache.org/jira/browse/DRILL-5735
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0, 1.10.0, 1.11.0
>Reporter: Muhammad Gelbana
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> I'm thinking of some UI improvements that could make all the difference for 
> users trying to optimize low-performing queries.
> h2. Options
> h3. Grouping
> We can organize the options into groups by their scope of effect; this will 
> help users easily locate the options they may need to tune.
> h3. Filtering
> Since there are a lot of options, we can add a filtering mechanism (i.e. string 
> search or group/scope filtering) so the user can filter out the options they are 
> not interested in. To provide more benefit than the grouping idea mentioned 
> above, filtering could also match keywords and not just the option name, 
> since the user may not know the name of the option they are looking for.
> h2. Metrics
> I'm referring here to the metrics page and the query execution plan page, which 
> display the overview section and the major/minor fragment metrics. We can show 
> hints for each metric, such as:
> # What it represents, in more detail.
> # Which option or scope of options to tune (increase? decrease?) to improve the 
> performance reported by this metric.
> # Maybe even provide a small dialog to quickly allow modification of the 
> option(s) related to that metric.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Paul Rogers (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509134#comment-16509134
 ] 

Paul Rogers commented on DRILL-6486:


Rather than muck with BitVector, consider swapping it out for the UInt1 vector; 
that swap has already been done everywhere else in Drill.

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer length is not a multiple of 8. The attached BitVector tests will 
> expose this problem. 
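> A minimal reference implementation, written here only to illustrate the issue 
> (this is not Drill's BitVector code): when the start offset or length is not a 
> multiple of 8, a plain byte-wise copy either pulls bits from the wrong positions 
> or carries bits past the end of the transfer, so the copy has to respect bit 
> boundaries.
> {code}
> // Copy `length` bits starting at bit `start` from src into a zero-filled dst.
> static void copyBits(byte[] src, int start, int length, byte[] dst) {
>   for (int i = 0; i < length; i++) {
>     int srcBit = start + i;
>     if ((src[srcBit >>> 3] & (1 << (srcBit & 7))) != 0) {
>       dst[i >>> 3] |= 1 << (i & 7);
>     }
>   }
> }
> {code}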



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-06-11 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5735:
--
Labels:   (was: ready-to-commit)

> UI options grouping and filtering & Metrics hints
> -
>
> Key: DRILL-5735
> URL: https://issues.apache.org/jira/browse/DRILL-5735
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0, 1.10.0, 1.11.0
>Reporter: Muhammad Gelbana
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> I'm thinking of some UI improvements that could make all the difference for 
> users trying to optimize low-performing queries.
> h2. Options
> h3. Grouping
> We can organize the options into groups by their scope of effect; this will 
> help users easily locate the options they may need to tune.
> h3. Filtering
> Since there are a lot of options, we can add a filtering mechanism (i.e. string 
> search or group/scope filtering) so the user can filter out the options they are 
> not interested in. To provide more benefit than the grouping idea mentioned 
> above, filtering could also match keywords and not just the option name, 
> since the user may not know the name of the option they are looking for.
> h2. Metrics
> I'm referring here to the metrics page and the query execution plan page, which 
> display the overview section and the major/minor fragment metrics. We can show 
> hints for each metric, such as:
> # What it represents, in more detail.
> # Which option or scope of options to tune (increase? decrease?) to improve the 
> performance reported by this metric.
> # Maybe even provide a small dialog to quickly allow modification of the 
> option(s) related to that metric.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6488) Drill native client - compile error due to usage of "template inline"

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509126#comment-16509126
 ] 

ASF GitHub Bot commented on DRILL-6488:
---

priteshm commented on issue #1317: DRILL-6488 - change instances of "template 
inline" to just "template"
URL: https://github.com/apache/drill/pull/1317#issuecomment-396456748
 
 
   @parthchandra could you please review this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill native client - compile error due to usage of "template inline"
> -
>
> Key: DRILL-6488
> URL: https://issues.apache.org/jira/browse/DRILL-6488
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Patrick Wong
>Assignee: Patrick Wong
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6481) Refactor ParquetXXXPredicate classes

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509124#comment-16509124
 ] 

ASF GitHub Bot commented on DRILL-6481:
---

ilooner closed pull request #1312:  DRILL-6481: Refactor ParquetXXXPredicate 
classes
URL: https://github.com/apache/drill/pull/1312
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicate.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicate.java
new file mode 100644
index 00..fa5c4672a8
--- /dev/null
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicate.java
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.stat;
+
+import org.apache.drill.common.expression.BooleanOperator;
+import org.apache.drill.common.expression.ExpressionPosition;
+import org.apache.drill.common.expression.LogicalExpression;
+import org.apache.drill.common.expression.visitors.ExprVisitor;
+
+import java.util.List;
+
+/**
+ * Boolean predicates for parquet filter pushdown.
+ */
+public abstract class ParquetBooleanPredicate<C extends Comparable<C>> extends BooleanOperator
+    implements ParquetFilterPredicate<C> {
+
+  private ParquetBooleanPredicate(String name, List<LogicalExpression> args, ExpressionPosition pos) {
+    super(name, args, pos);
+  }
+
+  @Override
+  public <T, V, E extends Exception> T accept(ExprVisitor<T, V, E> visitor, V value) throws E {
+    return visitor.visitBooleanOperator(this, value);
+  }
+
+  @SuppressWarnings("unchecked")
+  private static <C extends Comparable<C>> LogicalExpression createAndPredicate(
+      String name,
+      List<LogicalExpression> args,
+      ExpressionPosition pos
+  ) {
+    return new ParquetBooleanPredicate<C>(name, args, pos) {
+      @Override
+      public boolean canDrop(RangeExprEvaluator<C> evaluator) {
+        // "and" : as long as one branch is OK to drop, we can drop it.
+        for (LogicalExpression child : this) {
+          if (child instanceof ParquetFilterPredicate && ((ParquetFilterPredicate) child).canDrop(evaluator)) {
+            return true;
+          }
+        }
+        return false;
+      }
+    };
+  }
+
+  @SuppressWarnings("unchecked")
+  private static <C extends Comparable<C>> LogicalExpression createOrPredicate(
+      String name,
+      List<LogicalExpression> args,
+      ExpressionPosition pos
+  ) {
+    return new ParquetBooleanPredicate<C>(name, args, pos) {
+      @Override
+      public boolean canDrop(RangeExprEvaluator<C> evaluator) {
+        for (LogicalExpression child : this) {
+          // "or" : as long as one branch is NOT ok to drop, we can NOT drop it.
+          if (!(child instanceof ParquetFilterPredicate) || !((ParquetFilterPredicate) child).canDrop(evaluator)) {
+            return false;
+          }
+        }
+        return true;
+      }
+    };
+  }
+
+  public static <C extends Comparable<C>> LogicalExpression createBooleanPredicate(
+      String function,
+      String name,
+      List<LogicalExpression> args,
+      ExpressionPosition pos
+  ) {
+    switch (function) {
+      case "booleanOr":
+        return ParquetBooleanPredicate.<C>createOrPredicate(name, args, pos);
+      case "booleanAnd":
+        return ParquetBooleanPredicate.<C>createAndPredicate(name, args, pos);
+      default:
+        logger.warn("Unknown Boolean '{}' predicate.", function);
+        return null;
+    }
+  }
+}
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicates.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicates.java
deleted file mode 100644
index e5de34fc9d..00
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetBooleanPredicates.java
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright 

[jira] [Commented] (DRILL-6479) Support for EMIT outcome in Hash Aggregate

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509097#comment-16509097
 ] 

ASF GitHub Bot commented on DRILL-6479:
---

sohami commented on a change in pull request #1311: DRILL-6479: Support EMIT 
for the Hash Aggr
URL: https://github.com/apache/drill/pull/1311#discussion_r194601868
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ##
 @@ -573,6 +574,26 @@ public AggOutcome doWork() {
 delayedSetup();
   }
 
+  // If the prior call finished handling a prior incoming EMIT, need to call next() again
+  if ( !handleEmit && outcome == IterOutcome.EMIT ) {
 
 Review comment:
   As discussed, please try the following:
   1) Try to remove this code, since it is a duplicate and you would end up handling 
all the IterOutcomes again. Instead, try setting `currentBatchRecordCount` to 0 
once you are done processing all the incoming records. That way this code can 
be removed entirely.
   2) Please add a few unit tests.
   3) Add an exception in case spilling happens in the EMIT scenario.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support for EMIT outcome in Hash Aggregate
> --
>
> Key: DRILL-6479
> URL: https://issues.apache.org/jira/browse/DRILL-6479
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.14.0
>
>
> With the new Lateral and Unnest -- if a Hash-Aggregate operator is present in 
> the sub-query, then it needs to handle the EMIT outcome correctly. This means 
> that when an EMIT is received, the operator performs the aggregation on the 
> records buffered so far and produces output from them. After handling an 
> EMIT, the Hash-Aggr should refresh its state and continue to work on the 
> next batches of incoming records until an EMIT is seen again.
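> A rough pseudocode sketch of that contract (the helper names are placeholders 
> invented for this note; only the {{IterOutcome}} values come from Drill):
> {code}
> void onIncoming(IterOutcome outcome, RecordBatch batch) {
>   switch (outcome) {
>     case OK:
>       aggregate(batch);    // keep buffering and aggregating
>       break;
>     case EMIT:
>       aggregate(batch);    // finish the records of the current group
>       emitOutput();        // produce output for everything buffered so far
>       resetState();        // refresh, then continue with the next batches
>       break;
>     case NONE:
>       emitOutput();        // end of stream
>       break;
>     default:
>       break;
>   }
> }
> {code}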



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-4364) Image Metadata Format Plugin

2018-06-11 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509091#comment-16509091
 ] 

Bridget Bevens commented on DRILL-4364:
---

Added doc to the Apache Drill site: 
http://drill.apache.org/docs/image-metadata-format-plugin/
http://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions
 

Set doc label to doc-complete. Please let me know if you see any issues with 
the posted doc.

Thanks,
Bridget 


> Image Metadata Format Plugin
> 
>
> Key: DRILL-4364
> URL: https://issues.apache.org/jira/browse/DRILL-4364
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Reporter: Akihiko Kusanagi
>Assignee: Akihiko Kusanagi
>Priority: Major
>  Labels: doc-complete, ready-to-commit
> Fix For: 1.14.0
>
>
> Support querying of metadata in various image formats. This plugin leverages 
> [metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. This 
> plugin is especially useful when querying a large number of image files 
> stored in a distributed file system without building a metadata repository in 
> advance.
> This plugin supports the following file formats.
>  * JPEG, TIFF, PSD, PNG, BMP, GIF, ICO, PCX, WAV, AVI, WebP, MOV, MP4, EPS
>  * Camera Raw: ARW (Sony), CRW/CR2 (Canon), NEF (Nikon), ORF (Olympus), RAF 
> (FujiFilm), RW2 (Panasonic), RWL (Leica), SRW (Samsung), X3F (Foveon)
> This plugin enables reading the following metadata.
>  * Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, PNG 
> properties, BMP properties, GIF properties, ICO properties, PCX properties, 
> WAV properties, AVI properties, WebP properties, QuickTime properties, MP4 
> properties, EPS properties
> Since each type of metadata has a different set of fields, the plugin returns 
> a set of commonly used fields, such as the image width, height and bits per 
> pixel, for ease of use.
> *Examples:*
> Querying on a JPEG file with the property descriptive: true
> {noformat}
> 0: jdbc:drill:zk=local> select FileName, * from 
> dfs.`4349313028_f69ffa0257_o.jpg`;
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | FileName | FileSize | FileDateTime | Format | PixelWidth | PixelHeight | 
> BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | HasAlpha | 
> Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | 
> AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | 
> GPS | ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 +08:00 
> 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 
> 00:00:00 | Unknown | 0 | Unknown | 0 | 0 | 
> {"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 
> pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y 
> component: Quantization table 0, Sampling factors 2 horiz/2 
> vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 
> horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling 
> factors 1 horiz/1 vert"} | 
> {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
> dots","YResolution":"96 
> dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | 
> {"Software":"Picasa 3.0"} | 
> {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
> {"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | 
> {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
> 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
> 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
> {"Compression":"JPEG (old-style)","XResolution":"72 dots per 
> inch","YResolution":"72 dots per 
> inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
> bytes","ThumbnailLength":"7213 bytes"} | {} | 
> {"Keywords":"135;2002;issaquah;police car;wa;washington"} | 
> {"NumberOfTables":"4 Huffman tables"} | 
> {"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic 
> Experts 
> 

[jira] [Updated] (DRILL-4364) Image Metadata Format Plugin

2018-06-11 Thread Bridget Bevens (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens updated DRILL-4364:
--
Labels: doc-complete ready-to-commit  (was: doc-impacting ready-to-commit)

> Image Metadata Format Plugin
> 
>
> Key: DRILL-4364
> URL: https://issues.apache.org/jira/browse/DRILL-4364
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Reporter: Akihiko Kusanagi
>Assignee: Akihiko Kusanagi
>Priority: Major
>  Labels: doc-complete, ready-to-commit
> Fix For: 1.14.0
>
>
> Support querying of metadata in various image formats. This plugin leverages 
> [metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. This 
> plugin is especially useful when querying a large number of image files 
> stored in a distributed file system without building a metadata repository in 
> advance.
> This plugin supports the following file formats.
>  * JPEG, TIFF, PSD, PNG, BMP, GIF, ICO, PCX, WAV, AVI, WebP, MOV, MP4, EPS
>  * Camera Raw: ARW (Sony), CRW/CR2 (Canon), NEF (Nikon), ORF (Olympus), RAF 
> (FujiFilm), RW2 (Panasonic), RWL (Leica), SRW (Samsung), X3F (Foveon)
> This plugin enables reading the following metadata.
>  * Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, PNG 
> properties, BMP properties, GIF properties, ICO properties, PCX properties, 
> WAV properties, AVI properties, WebP properties, QuickTime properties, MP4 
> properties, EPS properties
> Since each type of metadata has a different set of fields, the plugin returns 
> a set of commonly used fields, such as the image width, height and bits per 
> pixel, for ease of use.
> *Examples:*
> Querying on a JPEG file with the property descriptive: true
> {noformat}
> 0: jdbc:drill:zk=local> select FileName, * from 
> dfs.`4349313028_f69ffa0257_o.jpg`;
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | FileName | FileSize | FileDateTime | Format | PixelWidth | PixelHeight | 
> BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | HasAlpha | 
> Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | 
> AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | 
> GPS | ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
> +--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
> | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 +08:00 
> 2018 | JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 
> 00:00:00 | Unknown | 0 | Unknown | 0 | 0 | 
> {"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 
> pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y 
> component: Quantization table 0, Sampling factors 2 horiz/2 
> vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 
> horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling 
> factors 1 horiz/1 vert"} | 
> {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
> dots","YResolution":"96 
> dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | 
> {"Software":"Picasa 3.0"} | 
> {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
> {"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | 
> {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
> 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
> 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
> {"Compression":"JPEG (old-style)","XResolution":"72 dots per 
> inch","YResolution":"72 dots per 
> inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
> bytes","ThumbnailLength":"7213 bytes"} | {} | 
> {"Keywords":"135;2002;issaquah;police car;wa;washington"} | 
> {"NumberOfTables":"4 Huffman tables"} | 
> {"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic 
> Experts 
> Group","DetectedMIMEType":"image/jpeg","ExpectedFileNameExtension":"jpg"} |
> 

[jira] [Commented] (DRILL-6422) Update guava to 23.0 and shade it

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509040#comment-16509040
 ] 

ASF GitHub Bot commented on DRILL-6422:
---

vrozov commented on a change in pull request #1264:  DRILL-6422: Update guava 
to 23.0 and shade it
URL: https://github.com/apache/drill/pull/1264#discussion_r194579907
 
 

 ##
 File path: contrib/guava-shaded/pom.xml
 ##
 @@ -0,0 +1,141 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+  4.0.0
+
+  
+drill-contrib-parent
+org.apache.drill.contrib
+1.14.0-SNAPSHOT
+  
+  guava-shaded
 
 Review comment:
   Add a version. Use the same version as the guava library that is shaded. There 
is no dependency on the Drill code, so there is no need to have a `SNAPSHOT` 
version. If you introduce drill-shaded, it should also have a fixed version. Do 
not include the Drill root pom as a parent (use the Apache pom as the parent), so it is 
not necessary to disable the rat and checkstyle plugins or deal with dependencies 
listed in the Drill root pom.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Some hadoop libraries use old versions of guava and most of them are 
> incompatible with guava 23.0.
> To allow usage of new guava version, it should be shaded and shaded version 
> should be used in the project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6422) Update guava to 23.0 and shade it

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509041#comment-16509041
 ] 

ASF GitHub Bot commented on DRILL-6422:
---

vrozov commented on a change in pull request #1264:  DRILL-6422: Update guava 
to 23.0 and shade it
URL: https://github.com/apache/drill/pull/1264#discussion_r194578996
 
 

 ##
 File path: contrib/guava-shaded/pom.xml
 ##
 @@ -0,0 +1,141 @@
+
 
 Review comment:
   I am not sure that contrib is the right place for shading guava. Consider 
introducing drill-shaded under the Drill root and moving guava under it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Some hadoop libraries use old versions of guava and most of them are 
> incompatible with guava 23.0.
> To allow usage of new guava version, it should be shaded and shaded version 
> should be used in the project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6422) Update guava to 23.0 and shade it

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509042#comment-16509042
 ] 

ASF GitHub Bot commented on DRILL-6422:
---

vrozov commented on a change in pull request #1264:  DRILL-6422: Update guava 
to 23.0 and shade it
URL: https://github.com/apache/drill/pull/1264#discussion_r194584632
 
 

 ##
 File path: contrib/guava-shaded/pom.xml
 ##
 @@ -0,0 +1,141 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+  4.0.0
+
+  
+drill-contrib-parent
+org.apache.drill.contrib
+1.14.0-SNAPSHOT
+  
+  guava-shaded
+  contrib/guava-shaded
+
+  jar
+
+  
+
+false
+  
+
+  
+
+  com.google.guava
+  guava
+  ${dep.guava.version}
+  jar
+
+  
+
+  
+
+
${project.build.directory}/shaded-sources
+
+
+  
+maven-source-plugin
+
+  true
+
+  
+  
+org.apache.maven.plugins
+maven-shade-plugin
+3.1.0
+
+  
+package
+
+  shade
+
+
+  true
+  true
+  false
+  
+
+  com.google.guava:*
+
+  
+  
+
+  com.google.common
+  
org.apache.drill.shaded.com.google.common
 
 Review comment:
   Consider adding `guava` to the shaded package name 
(`org.apache.drill.shaded.guava`) and using `com.google` as the `pattern`. What 
is the reason not to shade packages other than `common`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Some hadoop libraries use old versions of guava and most of them are 
> incompatible with guava 23.0.
> To allow usage of new guava version, it should be shaded and shaded version 
> should be used in the project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6212) A simple join is recursing too deep in planning and eventually throwing stack overflow.

2018-06-11 Thread Gautam Kumar Parai (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509017#comment-16509017
 ] 

Gautam Kumar Parai edited comment on DRILL-6212 at 6/12/18 12:57 AM:
-

I see that CALCITE-2223 is still open. We should go ahead and fix the issue on 
DRILL and open another DRILL Jira to revert/modify this fix once the CALCITE 
changes are complete.

 [~vvysotskyi] [~vrozov]  [~amansinha100] what do you think?


was (Author: gparai):
I see that CALCITE-2223 is still open. We should go ahead and fix the issue on 
DRILL and open another DRILL Jira to revert/modify this fix onceCALCITE changes 
are complete.

 [~vvysotskyi] [~vrozov]  [~amansinha100] what do you think?

> A simple join is recursing too deep in planning and eventually throwing stack 
> overflow.
> ---
>
> Key: DRILL-6212
> URL: https://issues.apache.org/jira/browse/DRILL-6212
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.12.0
>Reporter: Hanumath Rao Maduri
>Assignee: Chunhui Shi
>Priority: Critical
> Fix For: 1.14.0
>
>
> Create two views using the following statements.
> {code}
> create view v1 as select cast(greeting as int) f from 
> dfs.`/home/mapr/data/json/temp.json`;
> create view v2 as select cast(greeting as int) f from 
> dfs.`/home/mapr/data/json/temp.json`;
> {code}
> Executing the following join query produces a stack overflow during the 
> planning phase.
> {code}
> select t1.f from dfs.tmp.v1 as t inner join dfs.tmp.v2 as t1 on cast(t.f as 
> int) = cast(t1.f as int) and cast(t.f as int) = 10 and cast(t1.f as int) = 10;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6212) A simple join is recursing too deep in planning and eventually throwing stack overflow.

2018-06-11 Thread Gautam Kumar Parai (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509017#comment-16509017
 ] 

Gautam Kumar Parai commented on DRILL-6212:
---

I see that CALCITE-2223 is still open. We should go ahead and fix the issue on 
DRILL and open another DRILL Jira to revert/modify this fix once the CALCITE 
changes are complete.

 [~vvysotskyi] [~vrozov]  [~amansinha100] what do you think?

> A simple join is recursing too deep in planning and eventually throwing stack 
> overflow.
> ---
>
> Key: DRILL-6212
> URL: https://issues.apache.org/jira/browse/DRILL-6212
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.12.0
>Reporter: Hanumath Rao Maduri
>Assignee: Chunhui Shi
>Priority: Critical
> Fix For: 1.14.0
>
>
> Create two views using the following statements.
> {code}
> create view v1 as select cast(greeting as int) f from 
> dfs.`/home/mapr/data/json/temp.json`;
> create view v2 as select cast(greeting as int) f from 
> dfs.`/home/mapr/data/json/temp.json`;
> {code}
> Executing the following join query produces a stack overflow during the 
> planning phase.
> {code}
> select t1.f from dfs.tmp.v1 as t inner join dfs.tmp.v2 as t1 on cast(t.f as 
> int) = cast(t1.f as int) and cast(t.f as int) = 10 and cast(t1.f as int) = 10;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6488) Drill native client - compile error due to usage of "template inline"

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6488:
-
Reviewer: Parth Chandra

> Drill native client - compile error due to usage of "template inline"
> -
>
> Key: DRILL-6488
> URL: https://issues.apache.org/jira/browse/DRILL-6488
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Patrick Wong
>Assignee: Patrick Wong
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6488) Drill native client - compile error due to usage of "template inline"

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6488:
-
Fix Version/s: 1.14.0

> Drill native client - compile error due to usage of "template inline"
> -
>
> Key: DRILL-6488
> URL: https://issues.apache.org/jira/browse/DRILL-6488
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Patrick Wong
>Assignee: Patrick Wong
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6488) Drill native client - compile error due to usage of "template inline"

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508981#comment-16508981
 ] 

ASF GitHub Bot commented on DRILL-6488:
---

pwong-mapr opened a new pull request #1317: DRILL-6488 - change instances of 
"template inline" to just "template"
URL: https://github.com/apache/drill/pull/1317
 
 
   https://issues.apache.org/jira/browse/DRILL-6488


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill native client - compile error due to usage of "template inline"
> -
>
> Key: DRILL-6488
> URL: https://issues.apache.org/jira/browse/DRILL-6488
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Patrick Wong
>Assignee: Patrick Wong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6488) Drill native client - compile error due to usage of "template inline"

2018-06-11 Thread Patrick Wong (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wong updated DRILL-6488:

Summary: Drill native client - compile error due to usage of "template 
inline"  (was: Drill native client - compile error due to usage of "inline")

> Drill native client - compile error due to usage of "template inline"
> -
>
> Key: DRILL-6488
> URL: https://issues.apache.org/jira/browse/DRILL-6488
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Patrick Wong
>Assignee: Patrick Wong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6488) Drill native client - compile error due to usage of "inline"

2018-06-11 Thread Patrick Wong (JIRA)
Patrick Wong created DRILL-6488:
---

 Summary: Drill native client - compile error due to usage of 
"inline"
 Key: DRILL-6488
 URL: https://issues.apache.org/jira/browse/DRILL-6488
 Project: Apache Drill
  Issue Type: Bug
Reporter: Patrick Wong
Assignee: Patrick Wong






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6422) Update guava to 23.0 and shade it

2018-06-11 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508974#comment-16508974
 ] 

Pritesh Maker commented on DRILL-6422:
--

[~vrozov] can you review this today?

> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Some hadoop libraries use old versions of guava and most of them are 
> incompatible with guava 23.0.
> To allow usage of new guava version, it should be shaded and shaded version 
> should be used in the project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6478) enhance debug logs for batch sizing

2018-06-11 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6478:
-
Labels: ready-to-commit  (was: )

> enhance debug logs for batch sizing
> ---
>
> Key: DRILL-6478
> URL: https://issues.apache.org/jira/browse/DRILL-6478
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Fix some issues with debug logs so QA  scripts work better. Also, added batch 
> sizing logs for union all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6478) enhance debug logs for batch sizing

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508937#comment-16508937
 ] 

ASF GitHub Bot commented on DRILL-6478:
---

sohami commented on issue #1310: DRILL-6478: enhance debug logs for batch sizing
URL: https://github.com/apache/drill/pull/1310#issuecomment-396417335
 
 
   LGTM +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> enhance debug logs for batch sizing
> ---
>
> Key: DRILL-6478
> URL: https://issues.apache.org/jira/browse/DRILL-6478
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Fix some issues with debug logs so QA  scripts work better. Also, added batch 
> sizing logs for union all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6478) enhance debug logs for batch sizing

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6478:
-
Reviewer: Sorabh Hamirwasia

> enhance debug logs for batch sizing
> ---
>
> Key: DRILL-6478
> URL: https://issues.apache.org/jira/browse/DRILL-6478
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
>
> Fix some issues with debug logs so QA  scripts work better. Also, added batch 
> sizing logs for union all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6487) Negative row count when selecting from a json file with an OFFSET clause

2018-06-11 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-6487:
---

 Summary: Negative row count when selecting from a json file with 
an OFFSET clause
 Key: DRILL-6487
 URL: https://issues.apache.org/jira/browse/DRILL-6487
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.13.0
Reporter: Boaz Ben-Zvi
 Fix For: 1.14.0


This simple query fails: 

{code}
select * from dfs.`/data/foo.json` offset 1 row;
{code}

where foo.json is 
{code}
{"key": "aa", "sales": 11}
{"key": "bb", "sales": 22}
{code}

The error returned is:
{code}
0: jdbc:drill:zk=local> select * from dfs.`/data/foo.json` offset 1 row;
Error: SYSTEM ERROR: AssertionError


[Error Id: 960d66a9-b480-4a7e-9a25-beb4928e8139 on 10.254.130.25:31020]

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
during fragment initialization: null
org.apache.drill.exec.work.foreman.Foreman.run():282
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
  Caused By (java.lang.AssertionError) null
org.apache.calcite.rel.metadata.RelMetadataQuery.isNonNegative():900
org.apache.calcite.rel.metadata.RelMetadataQuery.validateResult():919
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount():236
org.apache.calcite.rel.SingleRel.estimateRowCount():68

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier$MajorFragmentStat.add():103

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel():76

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel():32

org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitProject():50
org.apache.drill.exec.planner.physical.ProjectPrel.accept():98

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitScreen():63

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitScreen():32
org.apache.drill.exec.planner.physical.ScreenPrel.accept():65

org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.removeExcessiveEchanges():41

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():557
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():179
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
org.apache.drill.exec.work.foreman.Foreman.runSQL():567
org.apache.drill.exec.work.foreman.Foreman.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
{code}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6486:


Assignee: Karthikeyan Manivannan

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6486:


Assignee: (was: Karthikeyan Manivannan)

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reopened DRILL-6486:
--

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6476) Generate explain plan which shows relation between Lateral and the corresponding Unnest.

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6476:
-
Labels: ready-to-commit  (was: )

> Generate explain plan which shows relation between Lateral and the 
> corresponding Unnest.
> 
>
> Key: DRILL-6476
> URL: https://issues.apache.org/jira/browse/DRILL-6476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.14.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Currently, the explain plan doesn't show which lateral and unnest nodes 
> are related. This information is good to have so that the visual plan can use 
> it and show the relation visually.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Karthikeyan Manivannan (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthikeyan Manivannan resolved DRILL-6486.
---
Resolution: Fixed

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6476) Generate explain plan which shows relation between Lateral and the corresponding Unnest.

2018-06-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6476:
-
Fix Version/s: 1.14.0

> Generate explain plan which shows relation between Lateral and the 
> corresponding Unnest.
> 
>
> Key: DRILL-6476
> URL: https://issues.apache.org/jira/browse/DRILL-6476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.14.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Currently, the explain plan doesn't show which lateral and unnest nodes 
> are related. This information is good to have so that the visual plan can use 
> it and show the relation visually.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508741#comment-16508741
 ] 

ASF GitHub Bot commented on DRILL-6486:
---

bitblender opened a new pull request #1316: DRILL-6486: BitVector split and 
transfer does not work correctly for non byte-multiple transfer lengths
URL: https://github.com/apache/drill/pull/1316
 
 
   Fix for the bug in BitVector splitAndTransfer. The logic for handling the copy 
of the last n bits was incorrect for non byte-multiple transfer lengths. This PR 
fixes the problem and adds new tests that simulate the failing condition and 
also exercise other changes in this PR.
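   
   For readers following the fix, the reference routine below illustrates the 
   trailing-bit handling being described: copying a run of validity bits whose 
   length is not a multiple of 8 requires masking the final partial byte rather 
   than copying a whole trailing byte. This is only a sketch of the expected 
   behaviour, with a hypothetical class name and plain byte arrays, not Drill's 
   actual BitVector/DrillBuf code.
   ```
   // Illustrative reference only; BitCopySketch and the byte[] buffers are
   // hypothetical, not Drill's vector API.
   public class BitCopySketch {
   
     // Copies `length` validity bits from `src`, starting at bit `startIndex`,
     // into `dst` starting at bit 0 (LSB-first within each byte).
     static void copyBits(byte[] src, int startIndex, int length, byte[] dst) {
       for (int i = 0; i < length; i++) {
         int srcBit = startIndex + i;
         if ((src[srcBit >>> 3] & (1 << (srcBit & 7))) != 0) {
           dst[i >>> 3] |= (1 << (i & 7));
         }
       }
       // An optimized byte-wise copy must mask off all but the low (length % 8)
       // bits of the last destination byte; copying the trailing byte wholesale
       // is what breaks non byte-multiple transfer lengths.
     }
   
     public static void main(String[] args) {
       byte[] src = {(byte) 0b10110101, (byte) 0b00001111};
       byte[] dst = new byte[2];
       copyBits(src, 3, 10, dst); // 10 bits: not a multiple of 8
       System.out.printf("dst = %02x %02x%n", dst[0] & 0xff, dst[1] & 0xff);
     }
   }
   ```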
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6476) Generate explain plan which shows relation between Lateral and the corresponding Unnest.

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508687#comment-16508687
 ] 

ASF GitHub Bot commented on DRILL-6476:
---

amansinha100 commented on a change in pull request #1308: DRILL-6476: Generate 
explain plan which shows relation between Latera…
URL: https://github.com/apache/drill/pull/1308#discussion_r194536532
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/explain/NumberingRelWriter.java
 ##
 @@ -71,7 +76,7 @@ protected void explain_(
 RelMetadataQuery mq = RelMetadataQuery.instance();
 if (!mq.isVisibleInExplain(rel, detailLevel)) {
   // render children in place of this, at same level
-  explainInputs(inputs);
+  explainInputs(rel);
 
 Review comment:
   Ok, in that case I suppose the different alias names are not accessible from 
within the unnest operator.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Generate explain plan which shows relation between Lateral and the 
> corresponding Unnest.
> 
>
> Key: DRILL-6476
> URL: https://issues.apache.org/jira/browse/DRILL-6476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.14.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>
> Currently, the explain plan doesn't show which lateral and unnest nodes 
> are related. This information is good to have so that the visual plan can use 
> it and show the relation visually.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Karthikeyan Manivannan (JIRA)
Karthikeyan Manivannan created DRILL-6486:
-

 Summary: BitVector split and transfer does not work correctly for 
non byte-multiple transfer lengths
 Key: DRILL-6486
 URL: https://issues.apache.org/jira/browse/DRILL-6486
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 1.13.0
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan
 Fix For: 1.14.0
 Attachments: TestSplitAndTransfer.java

BitVector splitAndTransfer does not correctly handle transfers where the 
transfer-length is not a multiple of 8. The attached bitVector tests will 
expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

2018-06-11 Thread Karthikeyan Manivannan (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthikeyan Manivannan updated DRILL-6486:
--
Attachment: TestSplitAndTransfer.java

> BitVector split and transfer does not work correctly for non byte-multiple 
> transfer lengths
> ---
>
> Key: DRILL-6486
> URL: https://issues.apache.org/jira/browse/DRILL-6486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.13.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: TestSplitAndTransfer.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BitVector splitAndTransfer does not correctly handle transfers where the 
> transfer-length is not a multiple of 8. The attached bitVector tests will 
> expose this problem. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6454) Native MapR DB plugin support for Hive MapR-DB json table

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508621#comment-16508621
 ] 

ASF GitHub Bot commented on DRILL-6454:
---

gparai commented on issue #1314: DRILL-6454: Native MapR DB plugin support for 
Hive MapR-DB json table
URL: https://github.com/apache/drill/pull/1314#issuecomment-396361770
 
 
   @vdiravka do you have any writeup/dspec for this feature?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Native MapR DB plugin support for Hive MapR-DB json table
> -
>
> Key: DRILL-6454
> URL: https://issues.apache.org/jira/browse/DRILL-6454
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Hive, Storage - MapRDB
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.14.0
>
>
> Hive can create and query MapR-DB tables via maprdb-json-handler:
> https://maprdocs.mapr.com/home/Hive/ConnectingToMapR-DB.html
> The aim of this Jira is to implement a Drill native reader for Hive MapR-DB 
> tables (similar to parquet).
> Design proposal is:
> - to use JsonTableGroupScan instead of HiveScan;
> - to add storage planning rule to convert HiveScan to MapRDBGroupScan;
> - to add system/session option to enable using of this native reader;
> - native reader can be used only for Drill build with mapr profile (there is 
> no reason to leverage it for default profile);
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6476) Generate explain plan which shows relation between Lateral and the corresponding Unnest.

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508589#comment-16508589
 ] 

ASF GitHub Bot commented on DRILL-6476:
---

HanumathRao commented on a change in pull request #1308: DRILL-6476: Generate 
explain plan which shows relation between Latera…
URL: https://github.com/apache/drill/pull/1308#discussion_r194517321
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/explain/NumberingRelWriter.java
 ##
 @@ -71,7 +76,7 @@ protected void explain_(
 RelMetadataQuery mq = RelMetadataQuery.instance();
 if (!mq.isVisibleInExplain(rel, detailLevel)) {
   // render children in place of this, at same level
-  explainInputs(inputs);
+  explainInputs(rel);
 
 Review comment:
   @amansinha100 I did consider making changes to explainTerms, but it 
does not work for two unnest nodes with the same column name. Hence I went 
with this approach.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Generate explain plan which shows relation between Lateral and the 
> corresponding Unnest.
> 
>
> Key: DRILL-6476
> URL: https://issues.apache.org/jira/browse/DRILL-6476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.14.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>
> Currently, the explain plan doesn't show which lateral and unnest nodes 
> are related. This information is good to have so that the visual plan can use 
> it and show the relation visually.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6476) Generate explain plan which shows relation between Lateral and the corresponding Unnest.

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508562#comment-16508562
 ] 

ASF GitHub Bot commented on DRILL-6476:
---

HanumathRao commented on issue #1308: DRILL-6476: Generate explain plan which 
shows relation between Latera…
URL: https://github.com/apache/drill/pull/1308#issuecomment-396352114
 
 
   @kkhatua @amansinha100  Thank you for the review. I have made the required 
code changes. Please let me know if anything needs to be changed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Generate explain plan which shows relation between Lateral and the 
> corresponding Unnest.
> 
>
> Key: DRILL-6476
> URL: https://issues.apache.org/jira/browse/DRILL-6476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.14.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>
> Currently, the explain plan doesn't show which lateral and unnest nodes 
> are related. This information is good to have so that the visual plan can use 
> it and show the relation visually.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6340) Output Batch Control in Project using the RecordBatchSizer

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508551#comment-16508551
 ] 

ASF GitHub Bot commented on DRILL-6340:
---

bitblender commented on a change in pull request #1302: DRILL-6340: Output 
Batch Control in Project using the RecordBatchSizer
URL: https://github.com/apache/drill/pull/1302#discussion_r194512650
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java
 ##
 @@ -193,6 +209,13 @@ protected IterOutcome doWork() {
 }
   }
   incomingRecordCount = incoming.getRecordCount();
+  memoryManager.update();
+  if (logger.isTraceEnabled()) {
+logger.trace("doWork():[1] memMgr RC " + 
memoryManager.getOutputRowCount()
 
 Review comment:
   Ok. Will do. But what is the advantage of using '{}' over static string 
concatenation guarded by a level check? I think format strings are more 
error-prone (argument-ordering errors) than concatenation.
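   
   For context, the '{}' form defers building the message until the logger has 
   confirmed that TRACE is enabled, which is what the explicit isTraceEnabled() 
   guard achieves by hand around concatenation. A minimal side-by-side sketch 
   (hypothetical class and variable names, not the ProjectRecordBatch code):
   ```
   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;
   
   // Minimal comparison of guarded concatenation vs. slf4j parameterized logging.
   public class TraceLogSketch {
     private static final Logger logger = LoggerFactory.getLogger(TraceLogSketch.class);
   
     void log(int outputRowCount, int incomingRecordCount) {
       // Hand-written guard: the string is only built when TRACE is enabled.
       if (logger.isTraceEnabled()) {
         logger.trace("doWork():[1] memMgr RC " + outputRowCount
             + ", incoming RC " + incomingRecordCount);
       }
   
       // Parameterized form: the message is only formatted when TRACE is enabled,
       // no explicit guard needed, but argument order must match the placeholders.
       logger.trace("doWork():[1] memMgr RC {}, incoming RC {}",
           outputRowCount, incomingRecordCount);
     }
   }
   ```
   Either form avoids the string-formatting cost when TRACE is off; the 
   parameterized form simply drops the boilerplate guard.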


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Output Batch Control in Project using the RecordBatchSizer
> --
>
> Key: DRILL-6340
> URL: https://issues.apache.org/jira/browse/DRILL-6340
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
>
> This bug is for tracking the changes required to implement Output Batch 
> Sizing in Project using the RecordBatchSizer. The challenge in doing this 
> mainly lies in dealing with expressions that produce variable-length columns. 
> The following doc talks about some of the design approaches for dealing with 
> such variable-length columns.
> [https://docs.google.com/document/d/1h0WsQsen6xqqAyyYSrtiAniQpVZGmQNQqC1I2DJaxAA/edit?usp=sharing]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508545#comment-16508545
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

kkhatua commented on issue #1309: DRILL-6477: Drillbit crashes with OOME (Heap) 
for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#issuecomment-396348998
 
 
   @parthchandra  updated the PR based on your review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508540#comment-16508540
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

parthchandra commented on a change in pull request #1309: DRILL-6477: Drillbit 
crashes with OOME (Heap) for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#discussion_r194510494
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java
 ##
 @@ -68,8 +78,37 @@ public QueryResult run(final WorkManager workManager, final 
WebUserConnection we
 // Submit user query to Drillbit work queue.
 final QueryId queryId = 
workManager.getUserWorker().submitWork(webUserConnection, runQuery);
 
+heapMemoryFailureThreshold = 
workManager.getContext().getConfig().getDouble( 
ExecConstants.HTTP_QUERY_FAIL_LOW_HEAP_THRESHOLD );
+boolean isComplete = false;
+boolean nearlyOutOfHeapSpace = false;
+float usagePercent = getHeapUsage();
+
 // Wait until the query execution is complete or there is error submitting 
the query
-webUserConnection.await();
+logger.debug("Wait until the query execution is complete or there is error 
submitting the query");
+do {
+  try {
+isComplete = webUserConnection.await(TimeUnit.SECONDS.toMillis(1)); 
/*periodically timeout to check heap*/
+  } catch (Exception e) { }
+
+  usagePercent = getHeapUsage();
+  if (usagePercent >  heapMemoryFailureThreshold) {
+nearlyOutOfHeapSpace = true;
+  }
+} while (!isComplete && !nearlyOutOfHeapSpace);
+
+//Fail if nearly out of heap space
+if (nearlyOutOfHeapSpace) {
+  workManager.getBee().getForemanForQueryId(queryId)
+.addToEventQueue(QueryState.FAILED,
+UserException.resourceError(
+new Throwable(
+"Query submitted through the Web interface was failed due 
to diminishing free heap memory ("+ Math.floor(((1-usagePercent)*100)) +"% 
free). "
 
 Review comment:
   Oh yes. That would be useful information to add to the message. But since 
the error is triggered at 0.85, the number would always be 15% :) 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508500#comment-16508500
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

kkhatua commented on a change in pull request #1309: DRILL-6477: Drillbit 
crashes with OOME (Heap) for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#discussion_r194503329
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##
 @@ -204,6 +204,8 @@ private ExecConstants() {
   public static final String SERVICE_KEYTAB_LOCATION = SERVICE_LOGIN_PREFIX + 
".keytab";
   public static final String KERBEROS_NAME_MAPPING = SERVICE_LOGIN_PREFIX + 
".auth_to_local";
 
+  /* Provide resiliency on web server for queries submitted via HTTP */
+  public static final String HTTP_QUERY_FAIL_LOW_HEAP_THRESHOLD = 
"drill.exec.http.query.fail.low_heap.threshold";
 
 Review comment:
   I am inclined towards having this as a drill-override.conf property, just 
so that we have a tuning mechanism. But 85% is a reasonable threshold, so I'll 
hard-code it as a constant within QueryWrapper.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508493#comment-16508493
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

kkhatua commented on a change in pull request #1309: DRILL-6477: Drillbit 
crashes with OOME (Heap) for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#discussion_r194502221
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java
 ##
 @@ -68,8 +78,37 @@ public QueryResult run(final WorkManager workManager, final 
WebUserConnection we
 // Submit user query to Drillbit work queue.
 final QueryId queryId = 
workManager.getUserWorker().submitWork(webUserConnection, runQuery);
 
+heapMemoryFailureThreshold = 
workManager.getContext().getConfig().getDouble( 
ExecConstants.HTTP_QUERY_FAIL_LOW_HEAP_THRESHOLD );
+boolean isComplete = false;
+boolean nearlyOutOfHeapSpace = false;
+float usagePercent = getHeapUsage();
+
 // Wait until the query execution is complete or there is error submitting 
the query
-webUserConnection.await();
+logger.debug("Wait until the query execution is complete or there is error 
submitting the query");
+do {
+  try {
+isComplete = webUserConnection.await(TimeUnit.SECONDS.toMillis(1)); 
/*periodically timeout to check heap*/
+  } catch (Exception e) { }
+
+  usagePercent = getHeapUsage();
+  if (usagePercent >  heapMemoryFailureThreshold) {
+nearlyOutOfHeapSpace = true;
+  }
+} while (!isComplete && !nearlyOutOfHeapSpace);
+
+//Fail if nearly out of heap space
+if (nearlyOutOfHeapSpace) {
+  workManager.getBee().getForemanForQueryId(queryId)
+.addToEventQueue(QueryState.FAILED,
+UserException.resourceError(
+new Throwable(
+"Query submitted through the Web interface was failed due 
to diminishing free heap memory ("+ Math.floor(((1-usagePercent)*100)) +"% 
free). "
 
 Review comment:
   I thought it would be useful to show the level the free memory was at 
prior to cancellation. I suspect that people will see a GC'ed Drillbit after 
the cancellation and wonder why Drill is complaining of insufficient heap. 
Hence the wording.
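   
   As background for this exchange, heap usage can be sampled from the standard 
   JVM memory MXBean. The sketch below uses hypothetical class and constant names 
   and only approximates how a getHeapUsage()-style probe and an 85% threshold 
   fit together; it is not the PR's actual QueryWrapper code.
   ```
   import java.lang.management.ManagementFactory;
   import java.lang.management.MemoryUsage;
   
   // Hypothetical heap-usage probe; not the actual QueryWrapper implementation.
   public class HeapUsageSketch {
     // Illustrative threshold: fail the web query once heap usage crosses ~85%.
     private static final double HEAP_FAILURE_THRESHOLD = 0.85;
   
     // Fraction of the maximum heap currently in use (0.0 - 1.0).
     // Assumes -Xmx is set so that getMax() is defined.
     static float getHeapUsage() {
       MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
       return (float) heap.getUsed() / heap.getMax();
     }
   
     public static void main(String[] args) {
       float usage = getHeapUsage();
       System.out.printf("heap used: %.0f%%, free: %.0f%% (fail above %.0f%% used)%n",
           usage * 100, (1 - usage) * 100, HEAP_FAILURE_THRESHOLD * 100);
     }
   }
   ```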


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6353) Upgrade Parquet MR dependencies

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508490#comment-16508490
 ] 

ASF GitHub Bot commented on DRILL-6353:
---

parthchandra commented on a change in pull request #1259: DRILL-6353: Upgrade 
Parquet MR dependencies
URL: https://github.com/apache/drill/pull/1259#discussion_r194501388
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##
 @@ -737,6 +738,7 @@ public void testBooleanPartitionPruning() throws Exception 
{
 }
   }
 
+  @Ignore
 
 Review comment:
   I also had an offline chat with Vlad on this one. The problem is that 
Parquet has changed its behaviour and will not give us the stats for Decimal 
when we read footers. 
   We have, therefore, no way of knowing whether Decimal stats are correct or 
not (even if they are correct) unless we try to hack something in Parquet. 
Hacking something in Parquet is not an option since that is exactly what this 
PR is trying to fix !
   Also, we have never supported Decimal in Drill, so we do not have to 
consider backward compatibility. There are some users using Decimal (based on 
posts to the mailing list), but the old implementation never worked reliably so 
this will be an overall improvement for all parties.
   
   +1. And thanks Vlad, Arina for pursuing this one to the end :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Upgrade Parquet MR dependencies
> ---
>
> Key: DRILL-6353
> URL: https://issues.apache.org/jira/browse/DRILL-6353
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: alltypes_optional.json, fixedlenDecimal.json
>
>
> Upgrade from a custom build {{1.8.1-drill-r0}} to Apache release {{1.10.0}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508477#comment-16508477
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

parthchandra commented on a change in pull request #1309: DRILL-6477: Drillbit 
crashes with OOME (Heap) for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#discussion_r194497978
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java
 ##
 @@ -68,8 +78,37 @@ public QueryResult run(final WorkManager workManager, final 
WebUserConnection we
 // Submit user query to Drillbit work queue.
 final QueryId queryId = 
workManager.getUserWorker().submitWork(webUserConnection, runQuery);
 
+heapMemoryFailureThreshold = 
workManager.getContext().getConfig().getDouble( 
ExecConstants.HTTP_QUERY_FAIL_LOW_HEAP_THRESHOLD );
+boolean isComplete = false;
+boolean nearlyOutOfHeapSpace = false;
+float usagePercent = getHeapUsage();
+
 // Wait until the query execution is complete or there is error submitting 
the query
-webUserConnection.await();
+logger.debug("Wait until the query execution is complete or there is error 
submitting the query");
+do {
+  try {
+isComplete = webUserConnection.await(TimeUnit.SECONDS.toMillis(1)); 
/*periodically timeout to check heap*/
+  } catch (Exception e) { }
+
+  usagePercent = getHeapUsage();
+  if (usagePercent >  heapMemoryFailureThreshold) {
+nearlyOutOfHeapSpace = true;
+  }
+} while (!isComplete && !nearlyOutOfHeapSpace);
+
+//Fail if nearly out of heap space
+if (nearlyOutOfHeapSpace) {
+  workManager.getBee().getForemanForQueryId(queryId)
+.addToEventQueue(QueryState.FAILED,
+UserException.resourceError(
+new Throwable(
+"Query submitted through the Web interface was failed due 
to diminishing free heap memory ("+ Math.floor(((1-usagePercent)*100)) +"% 
free). "
 
 Review comment:
   We could make this friendlier :)
   "There is not enough heap memory to run this query using the web interface. 
Please try a query with fewer columns or with a filter or limit condition to 
limit the data returned. You can also try an ODBC/JDBC client"


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6477) Drillbit hangs/crashes with OOME Java Heap Space for a large query through WebUI

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508478#comment-16508478
 ] 

ASF GitHub Bot commented on DRILL-6477:
---

parthchandra commented on a change in pull request #1309: DRILL-6477: Drillbit 
crashes with OOME (Heap) for a large WebUI query
URL: https://github.com/apache/drill/pull/1309#discussion_r194498399
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##
 @@ -204,6 +204,8 @@ private ExecConstants() {
   public static final String SERVICE_KEYTAB_LOCATION = SERVICE_LOGIN_PREFIX + 
".keytab";
   public static final String KERBEROS_NAME_MAPPING = SERVICE_LOGIN_PREFIX + 
".auth_to_local";
 
+  /* Provide resiliency on web server for queries submitted via HTTP */
+  public static final String HTTP_QUERY_FAIL_LOW_HEAP_THRESHOLD = 
"drill.exec.http.query.fail.low_heap.threshold";
 
 Review comment:
   I would have just put this as a constant in the QueryWrapper class. I don't 
expect the user to ever modify this, and when the QueryWrapper is updated to 
address large result sets, we won't have to worry about removing it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drillbit hangs/crashes with OOME Java Heap Space for a large query through 
> WebUI
> 
>
> Key: DRILL-6477
> URL: https://issues.apache.org/jira/browse/DRILL-6477
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> For queries submitted through the WebUI and retrieving a large resultset, the 
> Drillbit often hangs or crashes due to the (foreman) Drillbit running out of 
> Heap memory.
> This is because the Web client translates the resultset into a massive object 
> in the heap-space and tries to send that back to the browser. This results in 
> the VM thread actively trying to perform GC if the memory is not sufficient.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508455#comment-16508455
 ] 

ASF GitHub Bot commented on DRILL-5735:
---

parthchandra commented on issue #1279: DRILL-5735: Allow search/sort in the 
Options webUI
URL: https://github.com/apache/drill/pull/1279#issuecomment-396334998
 
 
   IMHO, code maintainability and coherence are perfectly valid requirements 
from a reviewer. In particular, having help text that is out of date or 
inconsistent is a fatal problem for user-facing code. If you're inclined, we 
can work on updating DRILL-4699 together.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> UI options grouping and filtering & Metrics hints
> -
>
> Key: DRILL-5735
> URL: https://issues.apache.org/jira/browse/DRILL-5735
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0, 1.10.0, 1.11.0
>Reporter: Muhammad Gelbana
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> I'm thinking of some UI improvements that could make all the difference for 
> users trying to optimize low-performing queries.
> h2. Options
> h3. Grouping
> We can group the options by their scope of effect; this will help users 
> easily locate the options they may need to tune.
> h3. Filtering
> Since there are a lot of options, we can add a filtering mechanism (i.e. string 
> search or group\scope filtering) so the user can filter out the options he's 
> not interested in. To provide more benefit than the grouping idea mentioned 
> above, filtering may also include keywords and not just the option name, 
> since the user may not be aware of the name of the option he's looking for.
> h2. Metrics
> I'm referring here to the metrics page and the query execution plan page that 
> displays the overview section and major\minor fragments metrics. We can show 
> hints for each metric such as:
> # What it represents, in more detail.
> # What option\scope-of-options to tune (increase ? decrease ?) to improve the 
> performance reported by this metric.
> # Maybe even provide a small dialog to quickly allow modification of the 
> related option(s) for that metric



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6454) Native MapR DB plugin support for Hive MapR-DB json table

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508446#comment-16508446
 ] 

ASF GitHub Bot commented on DRILL-6454:
---

vdiravka commented on issue #1314: DRILL-6454: Native MapR DB plugin support 
for Hive MapR-DB json table
URL: https://github.com/apache/drill/pull/1314#issuecomment-396332367
 
 
   @gparai @vrozov Could you review please this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Native MapR DB plugin support for Hive MapR-DB json table
> -
>
> Key: DRILL-6454
> URL: https://issues.apache.org/jira/browse/DRILL-6454
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Hive, Storage - MapRDB
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.14.0
>
>
> Hive can create and query MapR-DB tables via maprdb-json-handler:
> https://maprdocs.mapr.com/home/Hive/ConnectingToMapR-DB.html
> The aim of this Jira is to implement a Drill native reader for Hive MapR-DB 
> tables (similar to parquet).
> Design proposal is:
> - to use JsonTableGroupScan instead of HiveScan;
> - to add storage planning rule to convert HiveScan to MapRDBGroupScan;
> - to add system/session option to enable using of this native reader;
> - native reader can be used only for Drill build with mapr profile (there is 
> no reason to leverage it for default profile);
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6353) Upgrade Parquet MR dependencies

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508407#comment-16508407
 ] 

ASF GitHub Bot commented on DRILL-6353:
---

arina-ielchiieva commented on a change in pull request #1259: DRILL-6353: 
Upgrade Parquet MR dependencies
URL: https://github.com/apache/drill/pull/1259#discussion_r194481314
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##
 @@ -737,6 +738,7 @@ public void testBooleanPartitionPruning() throws Exception 
{
 }
   }
 
+  @Ignore
 
 Review comment:
   Vlad, thanks for investigating the issue. Since it's a Parquet problem, we 
can leave the tests ignored; just please add a comment to each of them 
indicating the root cause. @parthchandra are you ok with this approach?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Upgrade Parquet MR dependencies
> ---
>
> Key: DRILL-6353
> URL: https://issues.apache.org/jira/browse/DRILL-6353
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: alltypes_optional.json, fixedlenDecimal.json
>
>
> Upgrade from a custom build {{1.8.1-drill-r0}} to Apache release {{1.10.0}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6474) Queries with ORDER BY and OFFSET (w/o LIMIT) do not return any rows

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508282#comment-16508282
 ] 

ASF GitHub Bot commented on DRILL-6474:
---

ilooner commented on a change in pull request #1313: DRILL-6474: Don't use TopN 
when order by and offset are used without a limit specified.
URL: https://github.com/apache/drill/pull/1313#discussion_r194461574
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/limit/TestLimitPlanning.java
 ##
 @@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.limit;
+
+import org.apache.drill.PlanTestBase;
+import org.junit.Test;
+
+public class TestLimitPlanning extends PlanTestBase {
+
+  // DRILL-6474
+  @Test
+  public void dontPushdownIntoTopNWhenNoLimit() throws Exception {
+String query = "select full_name from cp.`employee.json` order by 
full_name offset 10";
+
+PlanTestBase.testPlanMatchingPatterns(query, new String[]{}, new 
String[]{".*TopN\\(.*"});
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Queries with ORDER BY and OFFSET (w/o LIMIT) do not return any rows
> ---
>
> Key: DRILL-6474
> URL: https://issues.apache.org/jira/browse/DRILL-6474
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.14.0
>
>
> This is easily reproduced with the following test
> {code}
> final ClusterFixtureBuilder builder = new 
> ClusterFixtureBuilder(baseDirTestWatcher);
> try (ClusterFixture clusterFixture = builder.build();
>  ClientFixture clientFixture = clusterFixture.clientFixture()) {
>   clientFixture.testBuilder()
> .sqlQuery("select name_s10 from `mock`.`employees_10` order by 
> name_s10 offset 100")
> .expectsNumRecords(99900)
> .build()
> .run();
> }
> {code}
> That fails with
> java.lang.AssertionError: 
> Expected :99900
> Actual   :0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508224#comment-16508224
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

vrozov commented on issue #1244: DRILL-6373: Refactor Result Set Loader for 
Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396287379
 
 
   @paul-rogers Please track the stack trace for the modification part. There 
is an attempt to add a child to the children of a vector that is part of the 
`incoming` batch. Please see `PartitionerTemplate.java:381`:
   ```
   ValueVector outgoingVector = TypeHelper.getNewVector(v.getField(), 
allocator);
   ```
   The existing assumption is that `v.getField()` is read-only (immutable), but 
it is passed to `AbstractMapVector`, which iterates over the passed `field`'s 
children instead of using a cloned version. Then 
   `child` is passed to `NullableVarCharVector`, which again modifies the passed 
parameter.
   
   Please consider changing the vector constructors to clone the passed-in field 
and to use the cloned version when adding children:
   ```
   BaseValueVector.java (add clone()):
 protected BaseValueVector(MaterializedField field, BufferAllocator 
allocator) {
   this.field = Preconditions.checkNotNull(field, "field cannot be 
null").*clone*();
   this.allocator = Preconditions.checkNotNull(allocator, "allocator cannot 
be null");
 }
   
   NullableVarCharVector.java (use cloned version):
 public NullableVarCharVector(MaterializedField field, BufferAllocator 
allocator) {
   super(field, allocator);
   // replace field with its clone
   field = getField();
   
   // The values vector has the same type and attributes
   // as the nullable vector, but with a mode of required. This ensures that
   // things like scale and precision are preserved in the values vector.
   // For backward compatibility, the values vector must have the same
   // name as the enclosing vector.
   // Setting the child to REQUIRED is a change, prior it was OPTIONAL. But
   // if we then use the field to create a vector, we get the wrong type
   // (we get an optional child vector when we want required.)
   
   values = new VarCharVector(
   MaterializedField.create(field.getName(),
   field.getType().toBuilder()
 .setMode(DataMode.REQUIRED)
 .build()),
   allocator);
   
   field.addChild(bits.getField());
   field.addChild(values.getField());
   accessor = new Accessor();
 }
   
   AbstractMapVector.java (use cloned version):
 protected AbstractMapVector(MaterializedField field, BufferAllocator 
allocator, CallBack callBack) {
   super(field.clone(), allocator, callBack);
   // replace field with its clone
   field = getField();
   // create the hierarchy of the child vectors based on the materialized 
field
   for (MaterializedField child : field.getChildren()) {
 if 
(child.getName().equals(BaseRepeatedValueVector.OFFSETS_FIELD.getName())) {
   continue;
 }
 final String fieldName = child.getName();
 final ValueVector v = BasicTypeHelper.getNewVector(child, allocator, 
callBack);
 putVector(fieldName, v);
   }
 }
   ```
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508130#comment-16508130
 ] 

ASF GitHub Bot commented on DRILL-5796:
---

jbimbert commented on issue #1298: DRILL-5796: Filter pruning for multi 
rowgroup parquet file
URL: https://github.com/apache/drill/pull/1298#issuecomment-396262789
 
 
   Tests for DRILL_6259 are now OK. 
   The issue came from a query of the form "select * from t where t.col[2] > 1", 
where col[] is an array; in this case, we cannot deduce the value of the column 
from the row group metadata, especially if col[2] is null, for example.
   To correct this, I chose to set the filter result to ROWS_MATCH.SOME instead 
of ROWS_MATCH.ALL for array columns only.
   This should impact only queries on array columns, which will run as usual 
(with no filter pruning).
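   
   To illustrate the rule, here is a minimal self-contained sketch (illustrative 
names, not the actual patch): a statistics result of ALL is conservatively 
downgraded to SOME when the predicate references an array column, so the row 
group is kept and the filter still runs at execution time.
   ```
   public class ArrayColumnPruningSketch {
     enum RowsMatch { ALL, NONE, SOME }

     // Statistics cannot tell whether col[2] exists or is null in every row,
     // so "all rows match" cannot be trusted for array element predicates.
     static RowsMatch evaluate(RowsMatch statsResult, boolean filterOnArrayColumn) {
       if (filterOnArrayColumn && statsResult == RowsMatch.ALL) {
         return RowsMatch.SOME;
       }
       return statsResult;
     }

     public static void main(String[] args) {
       System.out.println(evaluate(RowsMatch.ALL, true));   // SOME -> keep filter, no pruning
       System.out.println(evaluate(RowsMatch.ALL, false));  // ALL  -> filter can be pruned
       System.out.println(evaluate(RowsMatch.NONE, true));  // NONE -> row group can be dropped
     }
   }
   ```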


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Filter pruning for multi rowgroup parquet file
> --
>
> Key: DRILL-5796
> URL: https://issues.apache.org/jira/browse/DRILL-5796
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Jean-Blas IMBERT
>Priority: Major
> Fix For: 1.14.0
>
>
> Today, filter pruning uses the file name as the partitioning key. This means 
> you can remove a partition only if the whole file belongs to the same partition. 
> With Parquet, you can prune the filter if the row group forms a partition of 
> your dataset, since the unit of work is the row group, not the file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508115#comment-16508115
 ] 

ASF GitHub Bot commented on DRILL-5796:
---

jbimbert commented on a change in pull request #1298: DRILL-5796: Filter 
pruning for multi rowgroup parquet file
URL: https://github.com/apache/drill/pull/1298#discussion_r194419476
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetFilterPredicate.java
 ##
 @@ -18,5 +18,17 @@
 package org.apache.drill.exec.expr.stat;
 
 public interface ParquetFilterPredicate {
-  boolean canDrop(RangeExprEvaluator evaluator);
+  /**
+   * Define the validity of a row group against a filter
+   * 
+   *   ALL : all rows match the filter (canDrop the row group = false and filter pruning = true)
+   *   NONE : no row matches the filter (canDrop the row group = true)
+   *   SOME : some rows only match the filter (canDrop the row group = false and filter pruning = false)
+   *   INAPPLICABLE : filter can not be applied
+   * 
+   */
+  enum ROWS_MATCH {ALL, NONE, SOME, INAPPLICABLE
+  }
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Filter pruning for multi rowgroup parquet file
> --
>
> Key: DRILL-5796
> URL: https://issues.apache.org/jira/browse/DRILL-5796
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Jean-Blas IMBERT
>Priority: Major
> Fix For: 1.14.0
>
>
> Today, filter pruning uses the file name as the partitioning key. This means 
> you can remove a partition only if the whole file belongs to the same partition. 
> With Parquet, you can prune the filter if the row group forms a partition of 
> your dataset, since the unit of work is the row group, not the file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-6447:
---

Assignee: Vlad Rozov  (was: Arina Ielchiieva)

> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-6447:
---

Assignee: Arina Ielchiieva  (was: Vlad Rozov)

> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508101#comment-16508101
 ] 

ASF GitHub Bot commented on DRILL-6447:
---

arina-ielchiieva closed pull request #1291: DRILL-6447: Fixed a sanity check 
condition
URL: https://github.com/apache/drill/pull/1291
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/PageReader.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/PageReader.java
index bf75695b6b..a047fcc3e3 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/PageReader.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/PageReader.java
@@ -445,8 +445,8 @@ public void clear(){
* @throws IOException An IO related condition
*/
   void resetDefinitionLevelReader(int skipCount) throws IOException {
-if (parentColumnReader.columnDescriptor.getMaxDefinitionLevel() != 0) {
-  throw new UnsupportedOperationException("Unsupoorted Operation");
+if (parentColumnReader.columnDescriptor.getMaxDefinitionLevel() > 1) {
+  throw new UnsupportedOperationException("Unsupported Operation");
 }
 
 final Encoding dlEncoding = 
METADATA_CONVERTER.getEncoding(pageHeader.data_page_header.definition_level_encoding);
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBulkPageReader.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBulkPageReader.java
index b6205c1ef3..394f6ff863 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBulkPageReader.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBulkPageReader.java
@@ -72,6 +72,7 @@
   this.pageInfo.numPageFieldsRead = pageInfoInput.numPageFieldsRead;
   this.pageInfo.definitionLevels = pageInfoInput.definitionLevels;
   this.pageInfo.dictionaryValueReader = 
pageInfoInput.dictionaryValueReader;
+  this.pageInfo.numPageValues = pageInfoInput.numPageValues;
 }
 
 this.columnPrecInfo = columnPrecInfoInput;
@@ -94,6 +95,7 @@ final void set(PageDataInfo pageInfoInput) {
 pageInfo.numPageFieldsRead = pageInfoInput.numPageFieldsRead;
 pageInfo.definitionLevels = pageInfoInput.definitionLevels;
 pageInfo.dictionaryValueReader = pageInfoInput.dictionaryValueReader;
+pageInfo.numPageValues = pageInfoInput.numPageValues;
 
 buffer.clear();
   }
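
For context (not part of the PR, just a sketch based on Parquet's Dremel encoding 
rules): a REQUIRED top-level column has a maximum definition level of 0 and an 
OPTIONAL (nullable) top-level column has 1, so the reader only needs to reject 
levels above 1, which indicate nesting this reader does not handle. A minimal, 
hypothetical illustration of the revised guard:
```
public class DefinitionLevelGuardSketch {
  // 0 = required column, 1 = flat nullable column, > 1 = nested optional structure.
  static void checkMaxDefinitionLevel(int maxDefinitionLevel) {
    if (maxDefinitionLevel > 1) {
      throw new UnsupportedOperationException("Nested definition levels are not supported");
    }
  }

  public static void main(String[] args) {
    checkMaxDefinitionLevel(0);   // required column: accepted
    checkMaxDefinitionLevel(1);   // flat nullable column: accepted after this fix
    checkMaxDefinitionLevel(2);   // nested optional: still rejected (throws)
  }
}
```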


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508102#comment-16508102
 ] 

ASF GitHub Bot commented on DRILL-6447:
---

arina-ielchiieva commented on issue #1291: DRILL-6447: Fixed a sanity check 
condition
URL: https://github.com/apache/drill/pull/1291#issuecomment-396256397
 
 
   Sounds good. Closing the PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508089#comment-16508089
 ] 

ASF GitHub Bot commented on DRILL-6447:
---

vrozov commented on issue #1291: DRILL-6447: Fixed a sanity check condition
URL: https://github.com/apache/drill/pull/1291#issuecomment-396252450
 
 
   @arina-ielchiieva We agreed that this PR can be closed; I'll add the fix for 
`VarLenBulkPageReader.java` to the Parquet upgrade PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6447) Unsupported Operation when reading parquet data

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507937#comment-16507937
 ] 

ASF GitHub Bot commented on DRILL-6447:
---

arina-ielchiieva commented on issue #1291: DRILL-6447: Fixed a sanity check 
condition
URL: https://github.com/apache/drill/pull/1291#issuecomment-396216743
 
 
   @sachouche any update based on Vlad's feedback?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unsupported Operation when reading parquet data
> ---
>
> Key: DRILL-6447
> URL: https://issues.apache.org/jira/browse/DRILL-6447
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An exception is thrown when reading Parquet data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6481) Refactor ParquetXXXPredicate classes

2018-06-11 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6481:

Labels: ready-to-commit  (was: )

> Refactor ParquetXXXPredicate classes
> 
>
> Key: DRILL-6481
> URL: https://issues.apache.org/jira/browse/DRILL-6481
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Refactor ParquetXXXPredicate classes to avoid code duplication and add type 
> info to Statistics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6365) Drill cannot query files in parallel in local file system in REST API

2018-06-11 Thread mehran (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mehran updated DRILL-6365:
--
Description: 
I run two queries and they execute sequentially: the result of the first 
query is returned, and only then the result of the second query.

I'm running Drill on single-node filesystem storage and I have Parquet files 
to query.

 

After some digging I discovered that the Drill REST API is locked while a 
request executes, so requests run sequentially. Is that necessary?

 

  was:
I run two queries and they execute sequentially: the result of the first 
query is returned, and only then the result of the second query.

I'm running Drill on single-node filesystem storage and I have Parquet files 
to query.

 

Besides, the Drill HTTP server is slow. I do not know whether it is related, 
but in the two recent versions, 1.12 and 1.13, I see a long delay getting the 
first page even without authentication enabled.

 

Summary: Drill cannot query files in parallel in local file system in 
REST API  (was: Drill cannot query files in parallel in local file system)

> Drill cannot query files in parallel in local file system in REST API
> -
>
> Key: DRILL-6365
> URL: https://issues.apache.org/jira/browse/DRILL-6365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning  Optimization
>Affects Versions: 1.13.0
>Reporter: mehran
>Priority: Major
>
> I run two queries and they execute sequentially: the result of the first 
> query is returned, and only then the result of the second query.
> I'm running Drill on single-node filesystem storage and I have Parquet files 
> to query.
>  
> After some digging I discovered that the Drill REST API is locked while a 
> request executes, so requests run sequentially. Is that necessary?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-4682) Allow full schema identifier in SELECT clause

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507750#comment-16507750
 ] 

ASF GitHub Bot commented on DRILL-4682:
---

vdiravka commented on issue #549: DRILL-4682: Allow full schema identifier in 
SELECT clause
URL: https://github.com/apache/drill/pull/549#issuecomment-396147217
 
 
   @ilooner Yes, I am. It is in my backlog. The latest status for it is in my recent 
[comment](https://issues.apache.org/jira/browse/DRILL-4682?focusedCommentId=16483749=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16483749)
 on DRILL-4682.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow full schema identifier in SELECT clause
> -
>
> Key: DRILL-4682
> URL: https://issues.apache.org/jira/browse/DRILL-4682
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Reporter: Andries Engelbrecht
>Assignee: Vitalii Diravka
>Priority: Major
>
> Currently Drill requires aliases to identify columns in the SELECT clause 
> when working with multiple tables/workspaces.
> Many BI/analytical and other tools will, by default, use the full schema 
> identifier in the SELECT clause when generating SQL statements for generic 
> JDBC or ODBC sources. Not supporting this feature causes issues and slows 
> adoption of Drill as an execution engine within the larger analytical SQL 
> community.
> Propose to support 
> SELECT ... FROM 
> ..
> Also see DRILL-3510 for double quote support as per ANSI_QUOTES
> SELECT ""."".""."" FROM 
> ""."".""
> Which is very common generic SQL being generated by most tools when dealing 
> with a generic SQL data source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6455) JDBC Scan Operator does not appear in profile

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507730#comment-16507730
 ] 

ASF GitHub Bot commented on DRILL-6455:
---

kkhatua commented on issue #1297: DRILL-6455: Add missing JDBC Scan Operator 
for profiles
URL: https://github.com/apache/drill/pull/1297#issuecomment-396137502
 
 
   @sohami  Rebased this on top of DRILL-6459's protobuf changes. This PR now 
includes the generated files for the C++ client as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Scan Operator does not appear in profile
> -
>
> Key: DRILL-6455
> URL: https://issues.apache.org/jira/browse/DRILL-6455
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.13.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Critical
> Fix For: 1.14.0
>
>
> It seems that the Operator is not defined, though it appears in the text plan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)