Re: PLEASE VOTE NOW: proposals for Hadoop Summit EMEA

2015-12-15 Thread Nathan Griffith
Here's a link to all the Hadoop Summit talks returned by a 'Drill'
search: https://hadoopsummit.uservoice.com/search?filter=ideas=drill

--Nathan

On Tue, Dec 15, 2015 at 12:57 AM, Ellen Friedman
 wrote:
> Please VOTE NOW and pass the word to vote for Drill preso at Hadoop Summit
> EMEA
>
> Deadline today = 15 Dec
>
> Just click link and vote
>
> https://hadoopsummit.uservoice.com/forums/332073-hadoop-application-development-dev-languages-scr/suggestions/10848006-drilling-into-data-with-apache-drill
>
> Thank you,
> Ellen


Hangout Starting

2015-12-15 Thread Jacques Nadeau
https://plus.google.com/hangouts/_/dremio.com/drillhangout?authuser=0


Drill with HA hadoop

2015-12-15 Thread Cody Stevens
Hello,

I have been having an issue with Drill: when my NameNode fails over, the
HDFS storage plugin reports the files as "not found" until I update the
storage configuration to point at the now-active NameNode. I have tried
adding both NameNodes (comma separated), which Drill does not like. I have
also tried adding the HA namespace, but Drill complains about a hostname
not being found. I have searched through the documentation and online as
well but have not found a solution. Am I missing something simple? Since
Drill uses ZooKeeper, it should be able to find the active NameNode. I am
using CDH 5.4.0 with YARN.
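
For reference, a minimal sketch of an HA-aware setup (untested; "nameservice1" and the hostnames are placeholders): point the storage plugin connection at the logical nameservice rather than a single NameNode, and make sure the Drillbit's classpath carries the cluster's standard HDFS HA client settings:

{code}
# Drill dfs storage plugin -- connection uses the logical nameservice, not a host:
#   "connection": "hdfs://nameservice1"

# hdfs-site.xml on the Drillbit classpath (standard Hadoop HA client settings):
dfs.nameservices = nameservice1
dfs.ha.namenodes.nameservice1 = nn1,nn2
dfs.namenode.rpc-address.nameservice1.nn1 = namenode1.example.com:8020
dfs.namenode.rpc-address.nameservice1.nn2 = namenode2.example.com:8020
dfs.client.failover.proxy.provider.nameservice1 = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
{code}

With these in place, the HDFS client library itself, rather than Drill, handles finding the active NameNode.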

Thanks!

Cody


[GitHub] drill pull request: DRILL-4194: Improve the performance of metadat...

2015-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/301


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Question about the RecordIterator

2015-12-15 Thread Abdel Hakim Deneche
I see, it's in RecordIterator.mark()

On Tue, Dec 15, 2015 at 11:50 AM, Abdel Hakim Deneche  wrote:

> Amit,
>
> thanks for the prompt answer. Can you point me, in the code, where the
> purge is done ?
>
>
>
> On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke  wrote:
>
>> Hi Hakim,
>> RecordIterator will not hold all batches in memory. It holds batches from
>> last mark() operation.
>> It will purge batches as join moves along.
>>
>> Worst case case is when there are lots of repeating values on right side
>> which iterator will hold in memory.
>>
>> ~ Amit.
>>
>> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <
>> adene...@maprtech.com
>> > wrote:
>>
>> > Amit,
>> >
>> > I am looking at DRILL-4190 where one of the sort operators is hitting
>> it's
>> > allocator limit when it's sending data downstream. This generally happen
>> > when a downstream operator is holding those batches in memory (e.g.
>> Window
>> > Operator).
>> >
>> > The same query is running fine on 1.2.0 which seems to suggest that the
>> > recent changes to MergeJoinBatch "may" be causing the issue.
>> >
>> > It looks like RecordIterator is holding all incoming batches into a
>> > TreeRangeMap and if I'm not mistaken it doesn't release anything until
>> it's
>> > closed. Is this correct ?
>> >
>> > I am not familiar with how merge join used to work before
>> RecordIterator.
>> > Was it also the case that we hold all incoming batches in memory ?
>> >
>> > Thanks
>> >
>> > --
>> >
>> > Abdelhakim Deneche
>> >
>> > Software Engineer
>> >
>> >   
>> >
>> >
>> > Now Available - Free Hadoop On-Demand Training
>> > <
>> >
>> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
>> > >
>> >
>>
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   
>
>
> Now Available - Free Hadoop On-Demand Training
> 
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Re: Question about the RecordIterator

2015-12-15 Thread Abdel Hakim Deneche
Amit,

Thanks for the prompt answer. Can you point me to where in the code the
purge is done?



On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke  wrote:

> Hi Hakim,
> RecordIterator will not hold all batches in memory. It holds batches from
> last mark() operation.
> It will purge batches as join moves along.
>
> Worst case case is when there are lots of repeating values on right side
> which iterator will hold in memory.
>
> ~ Amit.
>
> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <
> adene...@maprtech.com
> > wrote:
>
> > Amit,
> >
> > I am looking at DRILL-4190 where one of the sort operators is hitting
> it's
> > allocator limit when it's sending data downstream. This generally happen
> > when a downstream operator is holding those batches in memory (e.g.
> Window
> > Operator).
> >
> > The same query is running fine on 1.2.0 which seems to suggest that the
> > recent changes to MergeJoinBatch "may" be causing the issue.
> >
> > It looks like RecordIterator is holding all incoming batches into a
> > TreeRangeMap and if I'm not mistaken it doesn't release anything until
> it's
> > closed. Is this correct ?
> >
> > I am not familiar with how merge join used to work before RecordIterator.
> > Was it also the case that we hold all incoming batches in memory ?
> >
> > Thanks
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Re: Question about the RecordIterator

2015-12-15 Thread Amit Hadke
Hi Hakim,
RecordIterator will not hold all batches in memory. It holds batches from
the last mark() operation and purges batches as the join moves along.

The worst case is when there are lots of repeating values on the right
side, which the iterator will hold in memory.
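
As a toy illustration of that retention contract (a hypothetical sketch, not Drill's actual RecordIterator code): batches seen before the most recent mark() become releasable, while batches after it stay buffered so the join can rewind on repeated keys.

{code}
// Hypothetical sketch of the mark/purge contract -- not Drill's RecordIterator.
import java.util.ArrayDeque;
import java.util.Deque;

class MarkedBatchBuffer<B> {
  private final Deque<B> retained = new ArrayDeque<>();

  // A new mark makes everything seen so far releasable.
  void mark() {
    retained.clear();          // purge: batches before the mark are released
  }

  // Batches after the mark stay buffered so the join can rewind to the mark.
  void onBatch(B batch) {
    retained.addLast(batch);   // worst case: many repeating right-side key values
  }

  int buffered() { return retained.size(); }
}
{code}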

~ Amit.

On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche  wrote:

> Amit,
>
> I am looking at DRILL-4190 where one of the sort operators is hitting it's
> allocator limit when it's sending data downstream. This generally happen
> when a downstream operator is holding those batches in memory (e.g. Window
> Operator).
>
> The same query is running fine on 1.2.0 which seems to suggest that the
> recent changes to MergeJoinBatch "may" be causing the issue.
>
> It looks like RecordIterator is holding all incoming batches into a
> TreeRangeMap and if I'm not mistaken it doesn't release anything until it's
> closed. Is this correct ?
>
> I am not familiar with how merge join used to work before RecordIterator.
> Was it also the case that we hold all incoming batches in memory ?
>
> Thanks
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> >
>


Re: Question about the RecordIterator

2015-12-15 Thread Amit Hadke
Yup that may be it. I'll add an option to not hold on to left side iterator
batches.

On Tue, Dec 15, 2015 at 11:56 AM, Abdel Hakim Deneche  wrote:

> RecordIterator.mark() is only called for the right side of the merge join.
> How about the left side, de we ever release the batches on the left side ?
> In 4190 the sort that runs out of memory is on the left side of the merge.
>
> On Tue, Dec 15, 2015 at 11:51 AM, Abdel Hakim Deneche <
> adene...@maprtech.com
> > wrote:
>
> > I see, it's in RecordIterator.mark()
> >
> > On Tue, Dec 15, 2015 at 11:50 AM, Abdel Hakim Deneche <
> > adene...@maprtech.com> wrote:
> >
> >> Amit,
> >>
> >> thanks for the prompt answer. Can you point me, in the code, where the
> >> purge is done ?
> >>
> >>
> >>
> >> On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke 
> >> wrote:
> >>
> >>> Hi Hakim,
> >>> RecordIterator will not hold all batches in memory. It holds batches
> from
> >>> last mark() operation.
> >>> It will purge batches as join moves along.
> >>>
> >>> Worst case case is when there are lots of repeating values on right
> side
> >>> which iterator will hold in memory.
> >>>
> >>> ~ Amit.
> >>>
> >>> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <
> >>> adene...@maprtech.com
> >>> > wrote:
> >>>
> >>> > Amit,
> >>> >
> >>> > I am looking at DRILL-4190 where one of the sort operators is hitting
> >>> it's
> >>> > allocator limit when it's sending data downstream. This generally
> >>> happen
> >>> > when a downstream operator is holding those batches in memory (e.g.
> >>> Window
> >>> > Operator).
> >>> >
> >>> > The same query is running fine on 1.2.0 which seems to suggest that
> the
> >>> > recent changes to MergeJoinBatch "may" be causing the issue.
> >>> >
> >>> > It looks like RecordIterator is holding all incoming batches into a
> >>> > TreeRangeMap and if I'm not mistaken it doesn't release anything
> until
> >>> it's
> >>> > closed. Is this correct ?
> >>> >
> >>> > I am not familiar with how merge join used to work before
> >>> RecordIterator.
> >>> > Was it also the case that we hold all incoming batches in memory ?
> >>> >
> >>> > Thanks
> >>> >
> >>> > --
> >>> >
> >>> > Abdelhakim Deneche
> >>> >
> >>> > Software Engineer
> >>> >
> >>> >   
> >>> >
> >>> >
> >>> > Now Available - Free Hadoop On-Demand Training
> >>> > <
> >>> >
> >>>
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> >>> > >
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >>
> >> Abdelhakim Deneche
> >>
> >> Software Engineer
> >>
> >>   
> >>
> >>
> >> Now Available - Free Hadoop On-Demand Training
> >> <
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> >
> >>
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> >
>


Re: Question about the RecordIterator

2015-12-15 Thread Abdel Hakim Deneche
Ok, thanks. I will add a comment to the JIRA and assign it to you ;)

On Tue, Dec 15, 2015 at 12:02 PM, Amit Hadke  wrote:

> Yup that may be it. I'll add an option to not hold on to left side iterator
> batches.
>
> On Tue, Dec 15, 2015 at 11:56 AM, Abdel Hakim Deneche <
> adene...@maprtech.com
> > wrote:
>
> > RecordIterator.mark() is only called for the right side of the merge
> join.
> > How about the left side, de we ever release the batches on the left side
> ?
> > In 4190 the sort that runs out of memory is on the left side of the
> merge.
> >
> > On Tue, Dec 15, 2015 at 11:51 AM, Abdel Hakim Deneche <
> > adene...@maprtech.com
> > > wrote:
> >
> > > I see, it's in RecordIterator.mark()
> > >
> > > On Tue, Dec 15, 2015 at 11:50 AM, Abdel Hakim Deneche <
> > > adene...@maprtech.com> wrote:
> > >
> > >> Amit,
> > >>
> > >> thanks for the prompt answer. Can you point me, in the code, where the
> > >> purge is done ?
> > >>
> > >>
> > >>
> > >> On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke 
> > >> wrote:
> > >>
> > >>> Hi Hakim,
> > >>> RecordIterator will not hold all batches in memory. It holds batches
> > from
> > >>> last mark() operation.
> > >>> It will purge batches as join moves along.
> > >>>
> > >>> Worst case case is when there are lots of repeating values on right
> > side
> > >>> which iterator will hold in memory.
> > >>>
> > >>> ~ Amit.
> > >>>
> > >>> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <
> > >>> adene...@maprtech.com
> > >>> > wrote:
> > >>>
> > >>> > Amit,
> > >>> >
> > >>> > I am looking at DRILL-4190 where one of the sort operators is
> hitting
> > >>> it's
> > >>> > allocator limit when it's sending data downstream. This generally
> > >>> happen
> > >>> > when a downstream operator is holding those batches in memory (e.g.
> > >>> Window
> > >>> > Operator).
> > >>> >
> > >>> > The same query is running fine on 1.2.0 which seems to suggest that
> > the
> > >>> > recent changes to MergeJoinBatch "may" be causing the issue.
> > >>> >
> > >>> > It looks like RecordIterator is holding all incoming batches into a
> > >>> > TreeRangeMap and if I'm not mistaken it doesn't release anything
> > until
> > >>> it's
> > >>> > closed. Is this correct ?
> > >>> >
> > >>> > I am not familiar with how merge join used to work before
> > >>> RecordIterator.
> > >>> > Was it also the case that we hold all incoming batches in memory ?
> > >>> >
> > >>> > Thanks
> > >>> >
> > >>> > --
> > >>> >
> > >>> > Abdelhakim Deneche
> > >>> >
> > >>> > Software Engineer
> > >>> >
> > >>> >   
> > >>> >
> > >>> >
> > >>> > Now Available - Free Hadoop On-Demand Training
> > >>> > <
> > >>> >
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Abdelhakim Deneche
> > >>
> > >> Software Engineer
> > >>
> > >>   
> > >>
> > >>
> > >> Now Available - Free Hadoop On-Demand Training
> > >> <
> >
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> > >
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> >
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Question about the RecordIterator

2015-12-15 Thread Abdel Hakim Deneche
Amit,

I am looking at DRILL-4190, where one of the sort operators is hitting its
allocator limit when it is sending data downstream. This generally happens
when a downstream operator is holding those batches in memory (e.g. the
Window operator).

The same query runs fine on 1.2.0, which seems to suggest that the
recent changes to MergeJoinBatch "may" be causing the issue.

It looks like RecordIterator is holding all incoming batches in a
TreeRangeMap and, if I'm not mistaken, it doesn't release anything until
it is closed. Is this correct?

I am not familiar with how merge join used to work before RecordIterator.
Was it also the case that we held all incoming batches in memory?

Thanks

-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Re: Question about the RecordIterator

2015-12-15 Thread Abdel Hakim Deneche
RecordIterator.mark() is only called for the right side of the merge join.
How about the left side: do we ever release the batches on the left side?
In 4190 the sort that runs out of memory is on the left side of the merge.

On Tue, Dec 15, 2015 at 11:51 AM, Abdel Hakim Deneche  wrote:

> I see, it's in RecordIterator.mark()
>
> On Tue, Dec 15, 2015 at 11:50 AM, Abdel Hakim Deneche <
> adene...@maprtech.com> wrote:
>
>> Amit,
>>
>> thanks for the prompt answer. Can you point me, in the code, where the
>> purge is done ?
>>
>>
>>
>> On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke 
>> wrote:
>>
>>> Hi Hakim,
>>> RecordIterator will not hold all batches in memory. It holds batches from
>>> last mark() operation.
>>> It will purge batches as join moves along.
>>>
>>> Worst case case is when there are lots of repeating values on right side
>>> which iterator will hold in memory.
>>>
>>> ~ Amit.
>>>
>>> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <
>>> adene...@maprtech.com
>>> > wrote:
>>>
>>> > Amit,
>>> >
>>> > I am looking at DRILL-4190 where one of the sort operators is hitting
>>> it's
>>> > allocator limit when it's sending data downstream. This generally
>>> happen
>>> > when a downstream operator is holding those batches in memory (e.g.
>>> Window
>>> > Operator).
>>> >
>>> > The same query is running fine on 1.2.0 which seems to suggest that the
>>> > recent changes to MergeJoinBatch "may" be causing the issue.
>>> >
>>> > It looks like RecordIterator is holding all incoming batches into a
>>> > TreeRangeMap and if I'm not mistaken it doesn't release anything until
>>> it's
>>> > closed. Is this correct ?
>>> >
>>> > I am not familiar with how merge join used to work before
>>> RecordIterator.
>>> > Was it also the case that we hold all incoming batches in memory ?
>>> >
>>> > Thanks
>>> >
>>> > --
>>> >
>>> > Abdelhakim Deneche
>>> >
>>> > Software Engineer
>>> >
>>> >   
>>> >
>>> >
>>> > Now Available - Free Hadoop On-Demand Training
>>> > <
>>> >
>>> http://www.mapr.com/training?utm_source=Email_medium=Signature_campaign=Free%20available
>>> > >
>>> >
>>>
>>
>>
>>
>> --
>>
>> Abdelhakim Deneche
>>
>> Software Engineer
>>
>>   
>>
>>
>> Now Available - Free Hadoop On-Demand Training
>> 
>>
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   
>
>
> Now Available - Free Hadoop On-Demand Training
> 
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



PLEASE VOTE NOW: proposals for Hadoop Summit EMEA

2015-12-15 Thread Ellen Friedman
Please VOTE NOW and pass the word to vote for Drill preso at Hadoop Summit
EMEA

Deadline today = 15 Dec

Just click link and vote

https://hadoopsummit.uservoice.com/forums/332073-hadoop-application-development-dev-languages-scr/suggestions/10848006-drilling-into-data-with-apache-drill

Thank you,
Ellen


[jira] [Created] (DRILL-4199) Add Support for HBase 1.X

2015-12-15 Thread Divjot singh (JIRA)
Divjot singh created DRILL-4199:
---

 Summary: Add Support for HBase 1.X
 Key: DRILL-4199
 URL: https://issues.apache.org/jira/browse/DRILL-4199
 Project: Apache Drill
  Issue Type: New Feature
  Components: Storage - HBase
Affects Versions: Future
Reporter: Divjot singh


Is there any roadmap to upgrade the HBase version to the 1.x series?
Currently Drill supports HBase version 0.98.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Naming the new ValueVector Initiative

2015-12-15 Thread Wes McKinney
For now I have presumptuously moved my C++ prototype to

https://github.com/arrow-data/arrow

I may have some cycles for this over the next few weeks -- it would be
great to develop a draft of the IPC protocol for transmitting table /
row batch metadata and data headers. I am going to be working on
building up enough tools and scaffolding to start assembling a
pandas.DataFrame-like Python wrapper layer which will keep me busy for
a fair while.

Let's decide soon whether we want 1 repo or multiple repos for the
reference implementations (C/C++ and Java). 1 repo might be easier for
integration testing.

I can convert the Google doc spec floating around to Markdown and
perhaps we can discuss specific details in GitHub issues? I'll use a
separate repo for the format docs.

best,
Wes

On Mon, Dec 14, 2015 at 9:43 AM, Wes McKinney  wrote:
> hi folks,
>
> In the interim I created a new public GitHub organization to host code
> for this effort so we can organize ourselves in advance of more
> progress in the ASF:
>
> https://github.com/arrow-data
>
> I have a partial C++ implementation of the Arrow spec that I can move
> there, along with a to-be-Markdown-ified version of a specification
> subject to more iteration. The more pressing short term matter will be
> making some progress on the metadata / data headers / IPC protocol
> (e.g. using Flatbuffers or the like).
>
> Thoughts on git repo structure?
>
> 1) Avro-style — "one repo to rule them all"
> 2) Parquet-style — arrow-format, arrow-cpp, arrow-java, etc.
>
> (I'm personally more in the latter camp, though integration tests may
> be more tedious that way)
>
> Thanks
>
> On Thu, Dec 3, 2015 at 4:18 PM, Jacques Nadeau  wrote:
>> I've opened a name search for our top vote getter.
>>
>> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-92
>>
>>
>> I also just realized that my previously email dropped other recipients.
>> Here it is below.
>>
>> 
>> I think we can call the voting closed. Top vote getters:
>>
>> Apache Arrow (17)
>> Apache Herringbone (9)
>> Apache Joist (8)
>> Apache Colbuf (8)
>>
>> I'll put up a PODLINGNAMESEARCH-* shortly for Arrow.
>>
>> ---
>>
>>
>>
>>
>>
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>> On Thu, Dec 3, 2015 at 1:23 AM, Marcel Kornacker 
>> wrote:
>>>
>>> Just added my vote.
>>>
>>> On Thu, Dec 3, 2015 at 12:51 PM, Wes McKinney  wrote:
>>> > Shall we call the voting closed? Any last stragglers?
>>> >
>>> > On Tue, Dec 1, 2015 at 5:39 PM, Ted Dunning 
>>> > wrote:
>>> >>
>>> >> Apache can handle this if we set the groundwork in place.
>>> >>
>>> >> Also, Twitter's lawyers work for Twitter, not for Apache. As such,
>>> >> their
>>> >> opinions can't be taken by Apache as legal advice.  There are issues of
>>> >> privilege, conflict of interest and so on.
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Dec 2, 2015 at 7:51 AM, Alex Levenson
>>> >> 
>>> >> wrote:
>>> >>>
>>> >>> I can ask about whether Twitter's lawyers can help out -- is that
>>> >>> something we need? Or is that something apache helps out with in the
>>> >>> next
>>> >>> step?
>>> >>>
>>> >>> On Mon, Nov 30, 2015 at 9:32 PM, Julian Hyde  wrote:
>>> 
>>>  +1 to have a vote tomorrow.
>>> 
>>>  Assuming that Vector is out of play, I just did a quick search for
>>>  the
>>>  top 4 remaining, (“arrow”, “honeycomb”, “herringbone”, “joist"), at
>>>  sourceforge, open hub, trademarkia, and on google. There are no
>>>  trademarks
>>>  for these in similar subject areas. There is a moderately active
>>>  project
>>>  called “joist” [1].
>>> 
>>>  I will point out that “Apache Arrow” has native-american connotations
>>>  that we may or may not want to live with (just ask the Washington
>>>  Redskins
>>>  how they feel about their name).
>>> 
>>>  If someone would like to vet other names, use the links on
>>>  https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-90, and fill
>>>  out
>>>  column C in the spreadsheet.
>>> 
>>>  Julian
>>> 
>>>  [1] https://github.com/stephenh/joist
>>> 
>>> 
>>>  On Nov 30, 2015, at 7:01 PM, Jacques Nadeau 
>>>  wrote:
>>> 
>>>  +1
>>> 
>>>  --
>>>  Jacques Nadeau
>>>  CTO and Co-Founder, Dremio
>>> 
>>>  On Mon, Nov 30, 2015 at 6:34 PM, Wes McKinney 
>>>  wrote:
>>> 
>>>  Should we have a last call for votes, closing EOD tomorrow (Tuesday)?
>>>  I
>>>  missed this for a few days last week with holiday travel.
>>> 
>>>  On Thu, Nov 26, 2015 at 3:04 PM, Julian Hyde 
>>>  wrote:
>>> 
>>>  Consulting a lawyer is part of the Apache branding process but the
>>>  first
>>>  stage is to gather a list of potential conflicts -
>>>  

Re: Unittest failure on master

2015-12-15 Thread Amit Hadke
Hey Guys,

I'm not able to reproduce the same issue, and the test doesn't seem to be
doing anything.

Can someone run "mvn -Dtest=TestTopNSchemaChanges#testMissingColumn test"
and see if it fails?

On Mon, Dec 14, 2015 at 11:51 PM, Amit Hadke  wrote:

> This seems like a bug in the topn code rather than in the test.
> We are expecting results sorted by kl2 (descending) so that non-null
> values come up on top.
> The results seem to have nulls on top instead.
>
> ~ Amit.
>
> On Mon, Dec 14, 2015 at 11:27 PM, Jason Altekruse <
> altekruseja...@gmail.com> wrote:
>
>> Seems weird that the results would be different based on reading order, as
>> the queries themselves contain an order by. Do we return different types
>> out of the sort depending on which schema we get first? Is this
>> intentional?
>>
>> - Jason
>>
>> On Mon, Dec 14, 2015 at 6:06 PM, Steven Phillips 
>> wrote:
>>
>> > I just did a build on a Linux box, and didn't see this failure. My guess is
>> > that it fails depending on which order the files are read.
>> >
>> > On Mon, Dec 14, 2015 at 5:38 PM, Venki Korukanti <
>> > venki.koruka...@gmail.com>
>> > wrote:
>> >
>> > > Is anyone else seeing below failure on latest master? I am running it
>> on
>> > > Linux.
>> > >
>> > >
>> > >
>> >
>> testMissingColumn(org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges)
>> > >  Time elapsed: 2.537 sec  <<< ERROR!
>> > > java.lang.Exception: unexpected null at position 0 column '`vl2`'
>> should
>> > > have been:  299
>> > >
>> > > Expected Records near verification failure:
>> > > Record Number: 0 { `kl1` : null,`kl2` : 299,`vl2` : 299,`vl1` :
>> > null,`vl` :
>> > > null,`kl` : null, }
>> > > Record Number: 1 { `kl1` : null,`kl2` : 298,`vl2` : 298,`vl1` :
>> > null,`vl` :
>> > > null,`kl` : null, }
>> > > Record Number: 2 { `kl1` : null,`kl2` : 297,`vl2` : 297,`vl1` :
>> > null,`vl` :
>> > > null,`kl` : null, }
>> > >
>> > >
>> > > Actual Records near verification failure:
>> > > Record Number: 0 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
>> > null,`vl`
>> > > : 100.0,`kl` : 100.0, }
>> > > Record Number: 1 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
>> > null,`vl`
>> > > : 101.0,`kl` : 101.0, }
>> > > Record Number: 2 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
>> > null,`vl`
>> > > : 102.0,`kl` : 102.0, }
>> > >
>> > > For query: select kl, vl, kl1, vl1, kl2, vl2 from
>> > >
>> > >
>> >
>> dfs_test.`/root/drill/exec/java-exec/target/1450142361702-0/topn-schemachanges`
>> > > order by kl2 desc limit 3
>> > > at
>> > >
>> > >
>> >
>> org.apache.drill.DrillTestWrapper.compareValuesErrorOnMismatch(DrillTestWrapper.java:512)
>> > > at
>> > >
>> > >
>> >
>> org.apache.drill.DrillTestWrapper.compareMergedVectors(DrillTestWrapper.java:170)
>> > > at
>> > >
>> > >
>> >
>> org.apache.drill.DrillTestWrapper.compareMergedOnHeapVectors(DrillTestWrapper.java:397)
>> > > at
>> > >
>> > >
>> >
>> org.apache.drill.DrillTestWrapper.compareOrderedResults(DrillTestWrapper.java:352)
>> > > at
>> > org.apache.drill.DrillTestWrapper.run(DrillTestWrapper.java:124)
>> > > at org.apache.drill.TestBuilder.go(TestBuilder.java:129)
>> > > at
>> > >
>> > >
>> >
>> org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges.testMissingColumn(TestTopNSchemaChanges.java:206)
>> > >
>> > >
>> > > Results :
>> > >
>> > > Tests in error:
>> > >   TestTopNSchemaChanges.testMissingColumn:206 »  unexpected null at
>> > > position 0 c...
>> > >
>> > > Tests run: 4, Failures: 0, Errors: 1, Skipped: 0
>> > >
>> >
>>
>
>


[GitHub] drill pull request: Add doc for Select with options

2015-12-15 Thread krishahn
Github user krishahn commented on the pull request:

https://github.com/apache/drill/pull/290#issuecomment-164914842
  
Bridget added a new section 
https://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats-attributes-as-table-function-parameters
 to cover this PR. Thanks, Julien.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Unittest failure on master

2015-12-15 Thread Jason Altekruse
Amit,

The message out of the test framework tries to provide enough information
to debug even if the issue isn't reproducible in your environment. Can you
think of any reason why it might be giving the different results shown in
the message if the order of the batches changed?

If you need to change the order yourself, there are two hacky approaches
you could try. Change the file names, or save the files in a different
order, to make the FS give them back to you in a different order. You could
also just combine the files and adjust the batch cutoff number used in
the JSON reader, with various orderings of the records in different versions
of the dataset.

As I write this I realize that combining the files will change the behavior
of the read, with the first batch giving a single type and later ones
giving a union type, as opposed to the multiple-files approach, which would
produce a bunch of different individual types and make the sort operation
generate the union type. To test this properly we may just need a test
harness to produce batches explicitly and feed them into an operator,
rather than relying on the JSON reader.

- Jason

On Tue, Dec 15, 2015 at 2:31 PM, Amit Hadke  wrote:

> Hey Guys,
>
> I'm not able to reproduce same issue and test doesn't seem to be doing
> anything.
>
> Can someone run "mvn -Dtest=TestTopNSchemaChanges#testMissingColumn test"
> and see if it fails?
>
> On Mon, Dec 14, 2015 at 11:51 PM, Amit Hadke  wrote:
>
> > This seems like  a bug in topn code than test.
> > We are expecting sorted by kl2 (descending) so that non null values come
> > up on top.
> > Results seems to be have nulls on top.
> >
> > ~ Amit.
> >
> > On Mon, Dec 14, 2015 at 11:27 PM, Jason Altekruse <
> > altekruseja...@gmail.com> wrote:
> >
> >> Seems weird that the results would be different based on reading order,
> as
> >> the queries themselves contain an order by. Do we return different types
> >> out of the sort depending on which schema we get first? Is this
> >> intentional?
> >>
> >> - Jason
> >>
> >> On Mon, Dec 14, 2015 at 6:06 PM, Steven Phillips 
> >> wrote:
> >>
> >> > I just did a build a linux box, and didn't see this failure. My guess
> is
> >> > that it fails depending on which order the files are read.
> >> >
> >> > On Mon, Dec 14, 2015 at 5:38 PM, Venki Korukanti <
> >> > venki.koruka...@gmail.com>
> >> > wrote:
> >> >
> >> > > Is anyone else seeing below failure on latest master? I am running
> it
> >> on
> >> > > Linux.
> >> > >
> >> > >
> >> > >
> >> >
> >>
> testMissingColumn(org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges)
> >> > >  Time elapsed: 2.537 sec  <<< ERROR!
> >> > > java.lang.Exception: unexpected null at position 0 column '`vl2`'
> >> should
> >> > > have been:  299
> >> > >
> >> > > Expected Records near verification failure:
> >> > > Record Number: 0 { `kl1` : null,`kl2` : 299,`vl2` : 299,`vl1` :
> >> > null,`vl` :
> >> > > null,`kl` : null, }
> >> > > Record Number: 1 { `kl1` : null,`kl2` : 298,`vl2` : 298,`vl1` :
> >> > null,`vl` :
> >> > > null,`kl` : null, }
> >> > > Record Number: 2 { `kl1` : null,`kl2` : 297,`vl2` : 297,`vl1` :
> >> > null,`vl` :
> >> > > null,`kl` : null, }
> >> > >
> >> > >
> >> > > Actual Records near verification failure:
> >> > > Record Number: 0 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> >> > null,`vl`
> >> > > : 100.0,`kl` : 100.0, }
> >> > > Record Number: 1 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> >> > null,`vl`
> >> > > : 101.0,`kl` : 101.0, }
> >> > > Record Number: 2 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> >> > null,`vl`
> >> > > : 102.0,`kl` : 102.0, }
> >> > >
> >> > > For query: select kl, vl, kl1, vl1, kl2, vl2 from
> >> > >
> >> > >
> >> >
> >>
> dfs_test.`/root/drill/exec/java-exec/target/1450142361702-0/topn-schemachanges`
> >> > > order by kl2 desc limit 3
> >> > > at
> >> > >
> >> > >
> >> >
> >>
> org.apache.drill.DrillTestWrapper.compareValuesErrorOnMismatch(DrillTestWrapper.java:512)
> >> > > at
> >> > >
> >> > >
> >> >
> >>
> org.apache.drill.DrillTestWrapper.compareMergedVectors(DrillTestWrapper.java:170)
> >> > > at
> >> > >
> >> > >
> >> >
> >>
> org.apache.drill.DrillTestWrapper.compareMergedOnHeapVectors(DrillTestWrapper.java:397)
> >> > > at
> >> > >
> >> > >
> >> >
> >>
> org.apache.drill.DrillTestWrapper.compareOrderedResults(DrillTestWrapper.java:352)
> >> > > at
> >> > org.apache.drill.DrillTestWrapper.run(DrillTestWrapper.java:124)
> >> > > at org.apache.drill.TestBuilder.go(TestBuilder.java:129)
> >> > > at
> >> > >
> >> > >
> >> >
> >>
> org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges.testMissingColumn(TestTopNSchemaChanges.java:206)
> >> > >
> >> > >
> >> > > Results :
> >> > >
> >> > > Tests in error:
> >> > >   TestTopNSchemaChanges.testMissingColumn:206 »  unexpected null at
> >> > > position 0 c...
> >> > >
> >> > > Tests 

[jira] [Resolved] (DRILL-3376) Reading individual files created by CTAS with partition causes an exception

2015-12-15 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli resolved DRILL-3376.
--
Resolution: Duplicate

> Reading individual files created by CTAS with partition causes an exception
> ---
>
> Key: DRILL-3376
> URL: https://issues.apache.org/jira/browse/DRILL-3376
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.1.0
>Reporter: Parth Chandra
>Assignee: Steven Phillips
> Fix For: 1.1.0
>
>
> Create a table using CTAS with partitioning:
> {code}
> create table `lineitem_part` partition by (l_moddate) as select l.*, 
> l_shipdate - extract(day from l_shipdate) + 1 l_moddate from 
> cp.`tpch/lineitem.parquet` l
> {code}
> Then the following query causes an exception
> {code}
> select distinct l_moddate from `lineitem_part/0_0_1.parquet` where l_moddate 
> = date '1992-01-01';
> {code}
> Trace in the log file - 
> {panel}
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: 0
> at java.lang.String.charAt(String.java:658) ~[na:1.7.0_65]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$PathPartition.<init>(PruneScanRule.java:493)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:385)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule$4.onMatch(PruneScanRule.java:278)
>  ~[drill-java-exec-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.1.0-drill-r9.jar:1.1.0-drill-r9]
> ... 13 common frames omitted
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Unittest failure on master

2015-12-15 Thread Amit Hadke
Jason,

I misunderstood earlier why the unit test is failing. It has nothing to do
with the ordering of files.

What's happening is that I'm doing a topn operation, in descending order,
on a field which is a union of strings and nulls.
The test checks that string values are on top, but somehow for some people
nulls are on top and the test fails.

I suspect it has to do with how the comparator treats nulls: high vs. low.
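
As a quick illustration of how that could flip the result (a hypothetical standalone comparator, not Drill's generated code):

{code}
// Hypothetical illustration -- not Drill's generated comparator code.
import java.util.Arrays;
import java.util.Comparator;

public class NullOrderDemo {
  public static void main(String[] args) {
    String[] nullsLow = {"a", null, "c", null, "b"};
    String[] nullsHigh = nullsLow.clone();

    // Nulls treated as LOW: a descending sort puts strings on top (what the test expects).
    Arrays.sort(nullsLow, Comparator.nullsLast(Comparator.<String>reverseOrder()));

    // Nulls treated as HIGH: a descending sort puts nulls on top (what the failing runs show).
    Arrays.sort(nullsHigh, Comparator.nullsFirst(Comparator.<String>reverseOrder()));

    System.out.println(Arrays.toString(nullsLow));  // [c, b, a, null, null]
    System.out.println(Arrays.toString(nullsHigh)); // [null, null, c, b, a]
  }
}
{code}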

~ Amit.



On Tue, Dec 15, 2015 at 3:23 PM, Jason Altekruse 
wrote:

> Amit,
>
> The message out of the test framework tries to provide enough information
> to debug even if the issues isn't reproducible in your environment. Can you
> think of any reason why it might be giving the different results shown in
> the message if the order of the batches changed?
>
> If you need to change the order yourself there are two hacky approaches you
> could do. Try changing the names or saving the files in a different order
>  to make the FS give them back to you in a different order. You also could
> just combine together the files and adjust the batch cutoff number used in
> the json reader, with various ordering of the records in different versions
> of the dataset.
>
> As I write this I realize that combining the files will change the behavior
> of the read. with the first batch giving a single type and later ones
> giving a union type. As opposed to the multiple files approach which would
> produce a bunch of different individual types and make the sort operation
> generate the union type. To test this properly we may just need a test
> harness to produce batches explicitly and feed them into an operator,
> rather than relying on the JSON reader.
>
> - Jason
>
> On Tue, Dec 15, 2015 at 2:31 PM, Amit Hadke  wrote:
>
> > Hey Guys,
> >
> > I'm not able to reproduce same issue and test doesn't seem to be doing
> > anything.
> >
> > Can someone run "mvn -Dtest=TestTopNSchemaChanges#testMissingColumn test"
> > and see if it fails?
> >
> > On Mon, Dec 14, 2015 at 11:51 PM, Amit Hadke 
> wrote:
> >
> > > This seems like  a bug in topn code than test.
> > > We are expecting sorted by kl2 (descending) so that non null values
> come
> > > up on top.
> > > Results seems to be have nulls on top.
> > >
> > > ~ Amit.
> > >
> > > On Mon, Dec 14, 2015 at 11:27 PM, Jason Altekruse <
> > > altekruseja...@gmail.com> wrote:
> > >
> > >> Seems weird that the results would be different based on reading
> order,
> > as
> > >> the queries themselves contain an order by. Do we return different
> types
> > >> out of the sort depending on which schema we get first? Is this
> > >> intentional?
> > >>
> > >> - Jason
> > >>
> > >> On Mon, Dec 14, 2015 at 6:06 PM, Steven Phillips 
> > >> wrote:
> > >>
> > >> > I just did a build a linux box, and didn't see this failure. My
> guess
> > is
> > >> > that it fails depending on which order the files are read.
> > >> >
> > >> > On Mon, Dec 14, 2015 at 5:38 PM, Venki Korukanti <
> > >> > venki.koruka...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Is anyone else seeing below failure on latest master? I am running
> > it
> > >> on
> > >> > > Linux.
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> testMissingColumn(org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges)
> > >> > >  Time elapsed: 2.537 sec  <<< ERROR!
> > >> > > java.lang.Exception: unexpected null at position 0 column '`vl2`'
> > >> should
> > >> > > have been:  299
> > >> > >
> > >> > > Expected Records near verification failure:
> > >> > > Record Number: 0 { `kl1` : null,`kl2` : 299,`vl2` : 299,`vl1` :
> > >> > null,`vl` :
> > >> > > null,`kl` : null, }
> > >> > > Record Number: 1 { `kl1` : null,`kl2` : 298,`vl2` : 298,`vl1` :
> > >> > null,`vl` :
> > >> > > null,`kl` : null, }
> > >> > > Record Number: 2 { `kl1` : null,`kl2` : 297,`vl2` : 297,`vl1` :
> > >> > null,`vl` :
> > >> > > null,`kl` : null, }
> > >> > >
> > >> > >
> > >> > > Actual Records near verification failure:
> > >> > > Record Number: 0 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> > >> > null,`vl`
> > >> > > : 100.0,`kl` : 100.0, }
> > >> > > Record Number: 1 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> > >> > null,`vl`
> > >> > > : 101.0,`kl` : 101.0, }
> > >> > > Record Number: 2 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1` :
> > >> > null,`vl`
> > >> > > : 102.0,`kl` : 102.0, }
> > >> > >
> > >> > > For query: select kl, vl, kl1, vl1, kl2, vl2 from
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> dfs_test.`/root/drill/exec/java-exec/target/1450142361702-0/topn-schemachanges`
> > >> > > order by kl2 desc limit 3
> > >> > > at
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> org.apache.drill.DrillTestWrapper.compareValuesErrorOnMismatch(DrillTestWrapper.java:512)
> > >> > > at
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> org.apache.drill.DrillTestWrapper.compareMergedVectors(DrillTestWrapper.java:170)
> > >> > > at
> > >> > >
> > >> > >
> > >> >
> > >>
> >

[jira] [Resolved] (DRILL-4169) Upgrade Hive Storage Plugin to work with latest stable Hive (v1.2.1)

2015-12-15 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-4169.

Resolution: Fixed

> Upgrade Hive Storage Plugin to work with latest stable Hive (v1.2.1)
> 
>
> Key: DRILL-4169
> URL: https://issues.apache.org/jira/browse/DRILL-4169
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Affects Versions: 1.4.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.5.0
>
>
> There have been a few bug fixes in Hive SerDes since Hive 1.0.0. It's good to 
> update the Hive storage plugin to work with the latest stable Hive version 
> (1.2.1), so that HiveRecordReader can use the latest SerDes.
> Compatibility when working with lower versions (v1.0.0, the currently supported 
> version) of Hive servers: there are no metastore API changes between Hive 
> 1.0.0 and Hive 1.2.1 that affect how Drill's Hive storage plugin interacts 
> with the Hive metastore. Tested to make sure it works fine. So users 
> can use Drill to query Hive 1.0.0 (currently supported) and Hive 1.2.1 (new 
> addition in this JIRA).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4203) Parquet File : Date is stored wrongly

2015-12-15 Thread JIRA
Stéphane Trou created DRILL-4203:


 Summary: Parquet File : Date is stored wrongly
 Key: DRILL-4203
 URL: https://issues.apache.org/jira/browse/DRILL-4203
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.4.0
Reporter: Stéphane Trou


Hello,

I have some problems when I try to read Parquet files produced by Drill
with Spark: all dates are corrupted.

I think the problem comes from Drill :)

{code}
cat /tmp/date_parquet.csv 
Epoch,1970-01-01
{code}

{code}
0: jdbc:drill:zk=local> select columns[0] as name, cast(columns[1] as date) as 
epoch_date from dfs.tmp.`date_parquet.csv`;
++-+
|  name  | epoch_date  |
++-+
| Epoch  | 1970-01-01  |
++-+
{code}

{code}
0: jdbc:drill:zk=local> create table dfs.tmp.`buggy_parquet`as select 
columns[0] as name, cast(columns[1] as date) as epoch_date from 
dfs.tmp.`date_parquet.csv`;
+---++
| Fragment  | Number of records written  |
+---++
| 0_0   | 1  |
+---++
{code}

When I read the file with parquet-tools, I found:
{code}
java -jar parquet-tools-1.8.1.jar head /tmp/buggy_parquet/
name = Epoch
epoch_date = 4881176
{code}

According to 
[https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#date], 
epoch_date should be equal to 0.
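
(The expected value follows directly from the definition: the Parquet DATE logical type is an INT32 counting days since the Unix epoch, so 1970-01-01 must encode as 0. A quick check in plain Java:)

{code}
// Parquet DATE = days since 1970-01-01, stored as INT32, so the epoch itself is day 0.
long days = java.time.LocalDate.parse("1970-01-01").toEpochDay(); // 0
// The stored value 4881176 would be roughly 13,000 years after the epoch.
{code}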

Meta : 
{code}
java -jar parquet-tools-1.8.1.jar meta /tmp/buggy_parquet/
file:file:/tmp/buggy_parquet/0_0_0.parquet 
creator: parquet-mr version 1.8.1-drill-r0 (build 
6b605a4ea05b66e1a6bf843353abcb4834a4ced8) 
extra:   drill.version = 1.4.0 

file schema: root 

name:OPTIONAL BINARY O:UTF8 R:0 D:1
epoch_date:  OPTIONAL INT32 O:DATE R:0 D:1

row group 1: RC:1 TS:93 OFFSET:4 

name: BINARY SNAPPY DO:0 FPO:4 SZ:52/50/0,96 VC:1 
ENC:RLE,BIT_PACKED,PLAIN
epoch_date:   INT32 SNAPPY DO:0 FPO:56 SZ:45/43/0,96 VC:1 
ENC:RLE,BIT_PACKED,PLAIN
{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4204) typeof function throws system error when input parameter is a literal value

2015-12-15 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-4204:
---

 Summary: typeof function throws system error when input parameter 
is a literal value
 Key: DRILL-4204
 URL: https://issues.apache.org/jira/browse/DRILL-4204
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.4.0
Reporter: Victoria Markman
Priority: Minor


{code}
0: jdbc:drill:schema=dfs> select typeof(1) from sys.options limit 1;
Error: SYSTEM ERROR: IllegalArgumentException: Can not set 
org.apache.drill.exec.vector.complex.reader.FieldReader field 
org.apache.drill.exec.expr.fn.impl.UnionFunctions$GetType.input to 
org.apache.drill.exec.expr.holders.IntHolder
[Error Id: 2139649a-b6f4-48b8-9a25-c0cb78072524 on atsqa4-134.qa.lab:31010] 
(state=,code=0)

0: jdbc:drill:schema=dfs> select typeof('1') from sys.options limit 1;
Error: SYSTEM ERROR: IllegalArgumentException: Can not set 
org.apache.drill.exec.vector.complex.reader.FieldReader field 
org.apache.drill.exec.expr.fn.impl.UnionFunctions$GetType.input to 
org.apache.drill.exec.expr.holders.VarCharHolder
[Error Id: 4f3b9fbd-6ad4-4d0d-ad7b-e22778a6bcb9 on atsqa4-134.qa.lab:31010] 
(state=,code=0)
{code}
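
For contrast, a column-valued input appears to avoid this path; as an untested sketch of the suspected boundary:

{code}
-- Untested sketch: a column argument instead of a literal, e.g.
select typeof(name) from sys.options limit 1;
{code}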

drillbit.log
{code}
2015-12-15 23:57:34,323 [298f5712-077d-21bc-49ec-ebc2aca5acce:foreman] ERROR 
o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: IllegalArgumentException: 
Can not set org.apache.drill.exec.vector.complex.reader.FieldReader field 
org.apache.drill.exec.expr.fn.impl.UnionFunctions$GetType.input to 
org.apache.drill.exec.expr.holders.IntHolder


[Error Id: 2139649a-b6f4-48b8-9a25-c0cb78072524 on atsqa4-134.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IllegalArgumentException: Can not set 
org.apache.drill.exec.vector.complex.reader.FieldReader field 
org.apache.drill.exec.expr.fn.impl.UnionFunctions$GetType.input to 
org.apache.drill.exec.expr.holders.IntHolder


[Error Id: 2139649a-b6f4-48b8-9a25-c0cb78072524 on atsqa4-134.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
 ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) 
[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) 
[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: Internal error: Error while applying 
rule ReduceExpressionsRule_Project, args 
[rel#4401:LogicalProject.NONE.ANY([]).[](input=rel#4400:Subset#0.ENUMERABLE.ANY([]).[],EXPR$0=TYPEOF(1))]
... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Error while applying rule 
ReduceExpressionsRule_Project, args 
[rel#4401:LogicalProject.NONE.ANY([]).[](input=rel#4400:Subset#0.ENUMERABLE.ANY([]).[],EXPR$0=TYPEOF(1))]
at org.apache.calcite.util.Util.newInternal(Util.java:792) 
~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
 ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
at 
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
 ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
at 
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
at 
org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:313) 
~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.doLogicalPlanning(DefaultSqlHandler.java:542)
 ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 

[GitHub] drill pull request: DRILL-4169: Upgrade Hive storage plugin to wor...

2015-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/302


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (DRILL-4198) Enhance StoragePlugin interface to expose logical space rules for planning purpose

2015-12-15 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-4198.

Resolution: Fixed

> Enhance StoragePlugin interface to expose logical space rules for planning 
> purpose
> --
>
> Key: DRILL-4198
> URL: https://issues.apache.org/jira/browse/DRILL-4198
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>
> Currently StoragePlugins can only expose rules that are executed in the physical 
> space. Add an interface method to StoragePlugin to expose logical-space rules 
> to the planner.
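
Schematically, the enhancement amounts to something like the following (a hypothetical shape; the method names and the context type parameter are illustrative, not necessarily the committed API):

{code}
// Hypothetical shape only -- illustrative, not necessarily the committed API.
import java.util.Set;
import org.apache.calcite.plan.RelOptRule;

// C stands in for whatever planning context Drill passes to plugins.
interface StoragePluginRulesSketch<C> {
  Set<RelOptRule> getPhysicalOptimizerRules(C plannerContext); // physical-space rules (existing behavior)
  Set<RelOptRule> getLogicalOptimizerRules(C plannerContext);  // new: logical-space rules
}
{code}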



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-4194) Improve the performance of metadata fetch operation in HiveScan

2015-12-15 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-4194.

Resolution: Fixed

> Improve the performance of metadata fetch operation in HiveScan
> ---
>
> Key: DRILL-4194
> URL: https://issues.apache.org/jira/browse/DRILL-4194
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.4.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.5.0
>
>
> Currently HiveScan fetches the InputSplits for all partitions when {{HiveScan}} 
> is created. This causes long delays when the table contains a large number of 
> partitions. If we end up pruning the majority of partitions, this delay is 
> unnecessary.
> We need this InputSplits info from the beginning of planning because
>  * it is used in calculating the cost of the {{HiveScan}}. Currently, when 
> calculating the cost, we first look at the rowCount (from the Hive MetaStore); 
> if it is available we use it in the cost calculation. Otherwise we estimate 
> the rowCount from the InputSplits. 
>  * We also need the InputSplits for determining whether {{HiveScan}} is a 
> singleton or distributed, for adding appropriate traits in {{ScanPrule}}.
> The fix is to delay the loading of the InputSplits until we need them. There 
> are two cases where we need them; if we end up fetching the InputSplits, store 
> them until the query completes.
>  * If the stats are not available, then we need the InputSplits.
>  * If the partition is not pruned, we need them for parallelization purposes.
> Regarding getting the parallelization info in {{ScanPrule}}: had a discussion 
> with [~amansinha100]. All we need at this point is whether the data is 
> distributed or singleton. Added a method {{isSingleton()}} to 
> GroupScan. Returning {{false}} seems to work fine for HiveScan, but I am not 
> sure of the implications here. We also have {{ExcessiveExchangeIdentifier}}, 
> which removes unnecessary exchanges by looking at the parallelization info. I 
> think it is OK to return the parallelization info here, as the pruning must 
> have already completed.
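
The delay-and-cache pattern described above, in schematic form (illustrative only; the names are made up and this is not the actual HiveScan code):

{code}
// Illustrative sketch only -- names are made up, not the actual HiveScan code.
import java.util.List;
import java.util.function.Supplier;

class LazySplitCache<S> {
  private final Supplier<List<S>> costlyLoader; // e.g. the InputSplit fetch from the metastore
  private List<S> cachedSplits;                 // loaded on first use, reused until the query completes

  LazySplitCache(Supplier<List<S>> costlyLoader) {
    this.costlyLoader = costlyLoader;
  }

  List<S> get() {
    // Pay the cost only when stats are missing or the partition survived pruning.
    if (cachedSplits == null) {
      cachedSplits = costlyLoader.get();
    }
    return cachedSplits;
  }
}
{code}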



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Unittest failure on master

2015-12-15 Thread Steven Phillips
To clarify, I was suggesting that the order in which the files are read
could be causing the variation in results, not that this is expected. There
definitely seems to be a bug. But the fact that it passes sometimes and not
others suggests the problem is exposed by file ordering.

On Tue, Dec 15, 2015 at 3:31 PM, Amit Hadke  wrote:

> Jason,
>
> I misunderstood earlier why unit test is failing. It has nothing to do with
> ordering of files.
>
> Whats happening is I'm doing topn operation on field which is union of
> strings and nulls in descending order.
> Test checks if string values are on top but somehow for some people nulls
> are on top and test fails.
>
> I'm suspecting it has to do with how comparator treats null - high/low.
>
> ~ Amit.
>
>
>
> On Tue, Dec 15, 2015 at 3:23 PM, Jason Altekruse  >
> wrote:
>
> > Amit,
> >
> > The message out of the test framework tries to provide enough information
> > to debug even if the issues isn't reproducible in your environment. Can
> you
> > think of any reason why it might be giving the different results shown in
> > the message if the order of the batches changed?
> >
> > If you need to change the order yourself there are two hacky approaches
> you
> > could do. Try changing the names or saving the files in a different order
> >  to make the FS give them back to you in a different order. You also
> could
> > just combine together the files and adjust the batch cutoff number used
> in
> > the json reader, with various ordering of the records in different
> versions
> > of the dataset.
> >
> > As I write this I realize that combining the files will change the
> behavior
> > of the read. with the first batch giving a single type and later ones
> > giving a union type. As opposed to the multiple files approach which
> would
> > produce a bunch of different individual types and make the sort operation
> > generate the union type. To test this properly we may just need a test
> > harness to produce batches explicitly and feed them into an operator,
> > rather than relying on the JSON reader.
> >
> > - Jason
> >
> > On Tue, Dec 15, 2015 at 2:31 PM, Amit Hadke 
> wrote:
> >
> > > Hey Guys,
> > >
> > > I'm not able to reproduce same issue and test doesn't seem to be doing
> > > anything.
> > >
> > > Can someone run "mvn -Dtest=TestTopNSchemaChanges#testMissingColumn
> test"
> > > and see if it fails?
> > >
> > > On Mon, Dec 14, 2015 at 11:51 PM, Amit Hadke 
> > wrote:
> > >
> > > > This seems like  a bug in topn code than test.
> > > > We are expecting sorted by kl2 (descending) so that non null values
> > come
> > > > up on top.
> > > > Results seems to be have nulls on top.
> > > >
> > > > ~ Amit.
> > > >
> > > > On Mon, Dec 14, 2015 at 11:27 PM, Jason Altekruse <
> > > > altekruseja...@gmail.com> wrote:
> > > >
> > > >> Seems weird that the results would be different based on reading
> > order,
> > > as
> > > >> the queries themselves contain an order by. Do we return different
> > types
> > > >> out of the sort depending on which schema we get first? Is this
> > > >> intentional?
> > > >>
> > > >> - Jason
> > > >>
> > > >> On Mon, Dec 14, 2015 at 6:06 PM, Steven Phillips  >
> > > >> wrote:
> > > >>
> > > >> > I just did a build a linux box, and didn't see this failure. My
> > guess
> > > is
> > > >> > that it fails depending on which order the files are read.
> > > >> >
> > > >> > On Mon, Dec 14, 2015 at 5:38 PM, Venki Korukanti <
> > > >> > venki.koruka...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Is anyone else seeing below failure on latest master? I am
> running
> > > it
> > > >> on
> > > >> > > Linux.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> testMissingColumn(org.apache.drill.exec.physical.impl.TopN.TestTopNSchemaChanges)
> > > >> > >  Time elapsed: 2.537 sec  <<< ERROR!
> > > >> > > java.lang.Exception: unexpected null at position 0 column
> '`vl2`'
> > > >> should
> > > >> > > have been:  299
> > > >> > >
> > > >> > > Expected Records near verification failure:
> > > >> > > Record Number: 0 { `kl1` : null,`kl2` : 299,`vl2` : 299,`vl1` :
> > > >> > null,`vl` :
> > > >> > > null,`kl` : null, }
> > > >> > > Record Number: 1 { `kl1` : null,`kl2` : 298,`vl2` : 298,`vl1` :
> > > >> > null,`vl` :
> > > >> > > null,`kl` : null, }
> > > >> > > Record Number: 2 { `kl1` : null,`kl2` : 297,`vl2` : 297,`vl1` :
> > > >> > null,`vl` :
> > > >> > > null,`kl` : null, }
> > > >> > >
> > > >> > >
> > > >> > > Actual Records near verification failure:
> > > >> > > Record Number: 0 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1`
> :
> > > >> > null,`vl`
> > > >> > > : 100.0,`kl` : 100.0, }
> > > >> > > Record Number: 1 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1`
> :
> > > >> > null,`vl`
> > > >> > > : 101.0,`kl` : 101.0, }
> > > >> > > Record Number: 2 { `kl1` : null,`vl2` : null,`kl2` : null,`vl1`
> :
> > > >> > null,`vl`

[GitHub] drill pull request: DRILL-4169: Upgrade Hive storage plugin to wor...

2015-12-15 Thread vkorukanti
GitHub user vkorukanti opened a pull request:

https://github.com/apache/drill/pull/302

DRILL-4169: Upgrade Hive storage plugin to work with Hive 1.2.1

+ HadoopShims.setTokenStr is moved to Utils.setTokenStr. There is no change
  in functionality.
+ Disable binary partition columns in Hive test suites. The binary
  partition column feature regressed in Hive 1.2.1. This should affect
  only the Hive execution that is used to generate the test data. If Drill
  is talking to Hive v1.0.0 (which has binary partition columns working),
  Drill should be able to get the data from Hive without any issues (tested).
+ Update the StorageHandler-based test, as there is an issue with test data
  generation in Hive. Need a separate test with a custom test StorageHandler.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vkorukanti/drill hive121

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/302.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #302


commit 1579d40641a8731b9478233d252349a7bf7166c5
Author: vkorukanti 
Date:   2015-12-11T19:36:11Z

DRILL-4194: Improve performance of the HiveScan metadata fetch operation

+ Use the stats (numRows) stored in the Hive metastore, whenever available,
  to calculate the costs for planning purposes
+ Delay the costly operation of loading InputSplits until needed. When
  InputSplits are loaded, cache them at the query level to speed up
  subsequent access.

this closes #301
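
The "delay until needed" item in the commit above is essentially memoization.
A minimal sketch, assuming Guava (already on Drill's classpath) and a
hypothetical loadSplits() helper doing the expensive metastore/filesystem
work -- this is an illustration, not the actual HiveScan code:

{code}
// Compute the InputSplits on first access only, then serve the cached
// result for every later call within the same query.
private final com.google.common.base.Supplier<List<InputSplit>> splits =
    com.google.common.base.Suppliers.memoize(
        new com.google.common.base.Supplier<List<InputSplit>>() {
          @Override
          public List<InputSplit> get() {
            return loadSplits();  // hypothetical: the costly split loading
          }
        });

List<InputSplit> getInputSplits() {
  // Planning paths that can cost the scan from metastore stats (numRows)
  // never call this, so those queries skip split loading entirely.
  return splits.get();
}
{code}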

commit ff555e63218038c5dddc5a4eecea7faf8cff058c
Author: vkorukanti 
Date:   2015-08-26T00:51:19Z

DRILL-4169: Upgrade Hive storage plugin to work with Hive 1.2.1

+ HadoopShims.setTokenStr is moved to Utils.setTokenStr. There is no change
  in functionality.
+ Disable binary partition columns in the Hive test suites. The binary
  partition column feature regressed in Hive 1.2.1. This should affect
  only the Hive execution that is used to generate the test data. If Drill
  is talking to Hive v1.0.0 (where binary partition columns work), Drill
  should be able to get the data from Hive without any issues (tested).
+ Update the StorageHandler-based test, as there is an issue with test data
  generation in Hive. A separate test with a custom test StorageHandler is
  needed.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-4200) drill-jdbc-storage: applies timezone to java.sql.Date field and fails

2015-12-15 Thread Karol Potocki (JIRA)
Karol Potocki created DRILL-4200:


 Summary: drill-jdbc-storage: applies timezone to java.sql.Date 
field and fails
 Key: DRILL-4200
 URL: https://issues.apache.org/jira/browse/DRILL-4200
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Affects Versions: 1.3.0
 Environment: drill-jdbc-storage plugin configured (based on 
https://drill.apache.org/docs/rdbms-storage-plugin) with 
org.relique.jdbc.csv.CsvDriver to access dbf (dbase) files.
Reporter: Karol Potocki


Using org.relique.jdbc.csv.CsvDriver to query files with date fields (e.g.
2012-05-01) causes:

{code}
UnsupportedOperationException: Method not supported: ResultSet.getDate(int, Calendar)
{code}

In JdbcRecordReader.java:406 there is a getDate call that tries to apply a
timezone to java.sql.Date, which is probably not timezone-related, and this
causes the error.

A quick fix is to use ResultSet.getDate(int) instead.
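
For illustration, a minimal sketch of that quick fix in DateCopier.copy();
the field and mutator names here are assumptions based on the stack trace
below, not the exact Drill source:

{code}
// Hedged sketch of DateCopier.copy() with the fix applied.
@Override
void copy(int index) throws SQLException {
  // Old call: result.getDate(columnIndex, calendar). The Calendar variant
  // is one that drivers like CsvDriver may not implement, hence the
  // UnsupportedOperationException below.
  java.sql.Date date = result.getDate(columnIndex);
  if (date != null) {
    // java.sql.Date is a plain millis-since-epoch wrapper with no timezone,
    // so dropping the Calendar argument loses nothing here.
    mutator.setSafe(index, date.getTime());
  }
}
{code}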

Details:
{code}
Caused by: java.lang.UnsupportedOperationException: Method not supported: ResultSet.getDate(int, Calendar)
        at org.relique.jdbc.csv.CsvResultSet.getDate(Unknown Source) ~[csvjdbc-1.0-28.jar:na]
        at org.apache.commons.dbcp.DelegatingResultSet.getDate(DelegatingResultSet.java:574) ~[commons-dbcp-1.4.jar:1.4]
        at org.apache.commons.dbcp.DelegatingResultSet.getDate(DelegatingResultSet.java:574) ~[commons-dbcp-1.4.jar:1.4]
        at org.apache.drill.exec.store.jdbc.JdbcRecordReader$DateCopier.copy(JdbcRecordReader.java:406) ~[drill-jdbc-storage-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:242) ~[drill-jdbc-storage-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4201) DrillPushFilterPastProject should allow partial filter pushdown.

2015-12-15 Thread Jinfeng Ni (JIRA)
Jinfeng Ni created DRILL-4201:
-

 Summary: DrillPushFilterPastProject should allow partial filter 
pushdown. 
 Key: DRILL-4201
 URL: https://issues.apache.org/jira/browse/DRILL-4201
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Jinfeng Ni
Assignee: Jinfeng Ni
 Fix For: 1.5.0


Currently, DrillPushFilterPastProjectRule stops pushing the filter down if the 
filter itself contains an ITEM or FLATTEN function, or if one of its input 
references refers to an ITEM or FLATTEN function. However, when the filter is a 
conjunction of multiple sub-filters, where some refer to ITEM or FLATTEN and 
the others do not, we should allow the partial filter to be pushed down. For 
instance,

WHERE partition_col > 10 AND flatten_output_col = 'ABC'

Here "flatten_output_col" comes from the output of the FLATTEN operator, and 
therefore flatten_output_col = 'ABC' should not be pushed past the project. But 
partition_col > 10 should be pushed down, so that we can trigger the pruning 
rule to apply partition pruning.

This would improve Drill query performance when the partially pushed filter 
leads to partition pruning, or when it results in early filtering in the 
upstream operator.
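
A hedged sketch of what the partial pushdown could look like inside the rule,
using Calcite's conjunction helpers. refersToItemOrFlatten() is a hypothetical
predicate standing in for the rule's existing ITEM/FLATTEN check; this is an
illustration, not the actual DrillPushFilterPastProjectRule code:

{code}
List<RexNode> pushable = new ArrayList<>();
List<RexNode> residual = new ArrayList<>();
for (RexNode conjunct : RelOptUtil.conjunctions(filter.getCondition())) {
  if (refersToItemOrFlatten(conjunct, project)) {
    residual.add(conjunct);   // e.g. flatten_output_col = 'ABC'
  } else {
    pushable.add(conjunct);   // e.g. partition_col > 10
  }
}
if (pushable.isEmpty()) {
  return;  // nothing safe to push; keep today's behavior
}
// Push the safe conjuncts below the project (enabling partition pruning)
// and keep the rest as a residual filter above it.
RexNode pushed = RexUtil.composeConjunction(rexBuilder, pushable, false);
RexNode residualCond = RexUtil.composeConjunction(rexBuilder, residual, true);
{code}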






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Naming the new ValueVector Initiative

2015-12-15 Thread Jacques Nadeau
Thanks Wes, that's great!
On Dec 14, 2015 9:44 AM, "Wes McKinney"  wrote:

> hi folks,
>
> In the interim I created a new public GitHub organization to host code
> for this effort so we can organize ourselves in advance of more
> progress in the ASF:
>
> https://github.com/arrow-data
>
> I have a partial C++ implementation of the Arrow spec that I can move
> there, along with a to-be-Markdown-ified version of a specification
> subject to more iteration. The more pressing short term matter will be
> making some progress on the metadata / data headers / IPC protocol
> (e.g. using Flatbuffers or the like).
>
> Thoughts on git repo structure?
>
> 1) Avro-style — "one repo to rule them all"
> 2) Parquet-style — arrow-format, arrow-cpp, arrow-java, etc.
>
> (I'm personally more in the latter camp, though integration tests may
> be more tedious that way)
>
> Thanks
>
> On Thu, Dec 3, 2015 at 4:18 PM, Jacques Nadeau  wrote:
> > I've opened a name search for our top vote getter.
> >
> > https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-92
> >
> >
> > I also just realized that my previous email dropped other recipients.
> > Here it is below.
> >
> > 
> > I think we can call the voting closed. Top vote getters:
> >
> > Apache Arrow (17)
> > Apache Herringbone (9)
> > Apache Joist (8)
> > Apache Colbuf (8)
> >
> > I'll open a PODLINGNAMESEARCH-* issue shortly for Arrow.
> >
> > ---
> >
> >
> >
> >
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Thu, Dec 3, 2015 at 1:23 AM, Marcel Kornacker 
> > wrote:
> >>
> >> Just added my vote.
> >>
> >> On Thu, Dec 3, 2015 at 12:51 PM, Wes McKinney  wrote:
> >> > Shall we call the voting closed? Any last stragglers?
> >> >
> >> > On Tue, Dec 1, 2015 at 5:39 PM, Ted Dunning 
> >> > wrote:
> >> >>
> >> >> Apache can handle this if we set the groundwork in place.
> >> >>
> >> >> Also, Twitter's lawyers work for Twitter, not for Apache. As such,
> >> >> their
> >> >> opinions can't be taken by Apache as legal advice. There are issues
> >> >> of privilege, conflict of interest and so on.
> >> >>
> >> >>
> >> >>
> >> >> On Wed, Dec 2, 2015 at 7:51 AM, Alex Levenson
> >> >> 
> >> >> wrote:
> >> >>>
> >> >>> I can ask about whether Twitter's lawyers can help out -- is that
> >> >>> something we need? Or is that something Apache helps out with in the
> >> >>> next step?
> >> >>>
> >> >>> On Mon, Nov 30, 2015 at 9:32 PM, Julian Hyde 
> wrote:
> >> 
> >>  +1 to have a vote tomorrow.
> >> 
> >>  Assuming that Vector is out of play, I just did a quick search for
> >>  the
> >>  top 4 remaining, (“arrow”, “honeycomb”, “herringbone”, “joist"), at
> >>  sourceforge, open hub, trademarkia, and on google. There are no
> >>  trademarks
> >>  for these in similar subject areas. There is a moderately active
> >>  project
> >>  called “joist” [1].
> >> 
> >>  I will point out that “Apache Arrow” has Native American connotations
> >>  that we may or may not want to live with (just ask the Washington
> >>  Redskins how they feel about their name).
> >> 
> >>  If someone would like to vet other names, use the links on
> >>  https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-90, and fill
> >>  out column C in the spreadsheet.
> >> 
> >>  Julian
> >> 
> >>  [1] https://github.com/stephenh/joist
> >> 
> >> 
> >>  On Nov 30, 2015, at 7:01 PM, Jacques Nadeau 
> >>  wrote:
> >> 
> >>  +1
> >> 
> >>  --
> >>  Jacques Nadeau
> >>  CTO and Co-Founder, Dremio
> >> 
> >>  On Mon, Nov 30, 2015 at 6:34 PM, Wes McKinney 
> >>  wrote:
> >> 
> >>  Should we have a last call for votes, closing EOD tomorrow (Tuesday)?
> >>  I missed this for a few days last week with holiday travel.
> >> 
> >>  On Thu, Nov 26, 2015 at 3:04 PM, Julian Hyde <
> jul...@hydromatic.net>
> >>  wrote:
> >> 
> >>  Consulting a lawyer is part of the Apache branding process but the
> >>  first
> >>  stage is to gather a list of potential conflicts -
> >>  https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-90 is an
> >>  example.
> >> 
> >>  The other part, frankly, is to pick your battles.
> >> 
> >>  A year or so ago Actian re-branded Vectorwise as Vector.
> >> 
> >> 
> >> 
> http://www.zdnet.com/article/actian-consolidates-its-analytics-portfolio/.
> >>  Given that it is an analytic database in the Hadoop space I think that
> >>  is as close to a “direct hit” as it gets. I don’t think we need a
> >>  lawyer to tell us that. Certainly it makes sense to look for conflicts
> >>  for the other alternatives before 

[GitHub] drill pull request: Blog post for Drill 1.4 release

2015-12-15 Thread nategri
GitHub user nategri opened a pull request:

https://github.com/apache/drill/pull/303

Blog post for Drill 1.4 release

Blog post detailing some highlights of the Drill 1.4 release.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nategri/drill gh-pages

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/303.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #303


commit 0fbf5c88029de2131f14aa8e92f62013718fabbe
Author: Nathan Griffith 
Date:   2015-12-15T22:35:30Z

blog post for Drill 1.4 release

commit 1392a0580bc33df8d26e488f5af1ec7c38264140
Author: Nathan Griffith 
Date:   2015-12-16T00:16:32Z

improvements to Drill 1.4 announcement blog post




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---