Re: Hangout starting in 5 minutes

2016-04-05 Thread Jason Altekruse
No need for apologies, John; today's discussion was pretty lively. These
are the notes I took.

---
community hangout 4/5/2016
---
Attendees: Jason, Arina, Vitalli, Stefan, Pawan, Parth, Jinfeng, Sudheesh,
Aman

Pawan is new. He might not have had a working mic, as he didn't give an
introduction.
- If you see this, feel free to respond to the thread with more info
  about yourself and your interest in Drill.

Topics
- Aman
  - Metadata cache file
    - proposal to create a separate file with just directory info
    - should we just put the info in a small database like SQLite?
    - DRILL-4530
  - Newlines in CSV files
    - DRILL-3178
    - Jason - to do this we need to turn off splittability
    - Aman - we can provide an option for users that want it, and they
      can choose if losing splittability is worth it for them
    - Jason - this should be a format/select with options setting, not
      a session one
    - Parth - or dotdrill
      - Julien was working on this? A proposal was given on a JIRA a
        while back
      - he has not updated it lately
    - generally the other option is more flexible: an admin can set it
      in the storage plugin or view, or a user can put it in a query
      themselves
      - dotdrill would require write permissions on the filesystem
      - still useful for other cases, collocating metadata with
        data, but not necessarily needed for this case initially
- Stefan
  - Wanted to apologize to Jacques for the thread last week
  - We will continue the discussion on the list about the best way
    forward for the Avro plugin
- Vitalli
  - questions about PRs
    - please review the PR for spill directories; he has updated it
      based on the comments
- Arina
  - JIRA for viewing logs in Web UI
    - originally logs only on the current node
    - getting remote logs
      - HTTP REST call
      - Custom RPC tunnel?
      - implement distributed read of files on each of the remote systems
- Parth
  - release schedule? Jacques withdrew his offer to manage 1.7 in order
    to focus on the new 2.0 branch
- Jinfeng
  - partition pruning enhancement
    - remove redundant filters, we re-evaluate predicates on parent
      directories over and over
  - test framework complaining about ordering of files
    - Jinfeng will make a proposal on the list about how to fix this at
      the test framework level
- Different time?
  - This lands pretty late for the folks in Ukraine
  - they said it was okay, but anyone who might not be attending due to
    the time the meeting happens, please speak up and we can look at
    moving it or scheduling it at a different time every other week or
    something to make sure everyone is included
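
A side note on the "Newlines in CSV files" item in the notes above: the reason splittability has to go is that a reader starting at an arbitrary byte offset cannot tell whether a newline sits inside a quoted field without scanning from the start of the file. Here is a minimal sketch of that, assuming plain double-quote quoting. The class and method names are made up for illustration, and this is not Drill's text reader.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not Drill's text reader. */
class QuotedNewlineDemo {

    /** Treats every '\n' as a record boundary, which is what independent byte-range splits assume. */
    static List<String> naiveRecords(String data) {
        List<String> out = new ArrayList<>();
        for (String line : data.split("\n")) {
            out.add(line);
        }
        return out;
    }

    /** Treats '\n' as a record boundary only outside double quotes, so it has to scan sequentially. */
    static List<String> quoteAwareRecords(String data) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state ends up unchanged.
                inQuotes = !inQuotes;
            }
            if (c == '\n' && !inQuotes) {
                out.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "id,comment\n1,\"line one\nline two\"\n2,plain\n";
        System.out.println("naive records:       " + naiveRecords(data).size());
        System.out.println("quote-aware records: " + quoteAwareRecords(data).size());
    }
}
```

The quote-aware version depends on state carried from the beginning of the input, which is why a per-file or per-query option (at the cost of parallel reads of one file) looks like the natural place for this setting.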

Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer

On Tue, Apr 5, 2016 at 12:32 PM, John Omernik wrote:

> Sorry I missed this, anything exciting happen?
>
> On Tue, Apr 5, 2016 at 11:57 AM, Jason Altekruse wrote:
>
> > Anyone with an interest in Drill is welcome to attend to hear what is
> > happening in the Drill community. Feel free to ask questions or just
> > listen in.
> >
> > https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
> >
> > Jason Altekruse
> > Software Engineer at Dremio
> > Apache Drill Committer
> >
>


Re: Hangout starting in 5 minutes

2016-04-05 Thread John Omernik
Sorry I missed this, anything exciting happen?

On Tue, Apr 5, 2016 at 11:57 AM, Jason Altekruse wrote:

> Anyone with an interest in Drill is welcome to attend to hear what is
> happening in the Drill community. Feel free to ask questions or just listen
> in.
>
> https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
>
> Jason Altekruse
> Software Engineer at Dremio
> Apache Drill Committer
>


Hangout starting in 5 minutes

2016-04-05 Thread Jason Altekruse
Anyone with an interest in Drill is welcome to attend to hear what is
happening in the Drill community. Feel free to ask questions or just listen
in.

https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer


Hangout starting in 5 minutes

2016-01-05 Thread Jason Altekruse
Come join the Drill community in our weekly hangout meeting to find out
what is going on with Drill right now.

https://plus.google.com/hangouts/_/dremio.com/drillhangout

Some items I would like to discuss this week:
- 1.5 release, issues left to fix, when would we like to target for a vote
- Drill parquet date bug: https://issues.apache.org/jira/browse/DRILL-4203

Feel free to respond with other items you would like to have discussed, or
just jump on the call.


Re: Hangout starting in 5 minutes

2016-01-05 Thread Jason Altekruse
Notes: Drill hangout - 1/5/2016

Attendees: Vicky, Andries, Hakim, Aman, Julien, Jason, Charles


Drill 1.5 release thread, number of outstanding issues to solve


Parquet dates

- metadata migration may be needed for old files
- check the migration tool to make sure it doesn't update already known
  versions to a newer one
- can use some combination of a whitelist of known good writers as well
  as checking the year on some of the values and that the flag to
  auto-correct bad dates is set
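
To make the last point concrete, here is a rough sketch of how the three checks (auto-correct flag, writer whitelist, year sanity check) could be combined. Everything named here is an assumption for illustration: the writer string, the year cutoff, and the method signature are hypothetical and do not come from Drill or the Parquet library.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of the heuristic discussed above; not Drill code. */
class DateSanityCheck {

    // Hypothetical whitelist of writer identifiers known to produce correct dates.
    static final Set<String> KNOWN_GOOD_WRITERS =
            new HashSet<>(Arrays.asList("example-writer 1.0"));

    // Hypothetical cutoff: dates beyond this year are treated as suspicious.
    static final int MAX_PLAUSIBLE_YEAR = 5000;

    /**
     * Returns true when a date value (days since 1970-01-01) should be treated as corrupt:
     * the auto-correct flag is on, the writer is not whitelisted, and the year is implausible.
     */
    static boolean looksCorrupt(String createdBy, int daysSinceEpoch, boolean autoCorrect) {
        if (!autoCorrect) {
            return false;
        }
        if (KNOWN_GOOD_WRITERS.contains(createdBy)) {
            return false;
        }
        int year = LocalDate.ofEpochDay(daysSinceEpoch).getYear();
        return year > MAX_PLAUSIBLE_YEAR;
    }

    public static void main(String[] args) {
        // 16805 days after the epoch falls in 2016; 5,000,000 days is many thousands of years out.
        System.out.println(looksCorrupt("unknown-writer", 16805, true));
        System.out.println(looksCorrupt("unknown-writer", 5_000_000, true));
        System.out.println(looksCorrupt("example-writer 1.0", 5_000_000, true));
    }
}
```

Checking the whitelist first means files from trusted writers are never rewritten, which lines up with the concern about the migration tool touching files that are already correct.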


Allocator bugs

- CTAS with partitioning was failing, sort running out of memory
- Flatten was having issues
- Functional suite
  - intermittent
- Unit test, Jacques did not file a bug
  - sent an e-mail to Parth and Hakim
- advanced suite, out of memory failures
  - See if Dremio infrastructure is running advanced tests
  - checked with Jacques, Dremio is not currently running the
    advanced suite


Amit's branch - testing by Vicky

- blocked on the sort bug
- his fix is on top of 1.5, which fails with out of memory in sort
  before it reaches the merge join operator


Aman hash skew - DRILL-4237

- strings of length 32 or more chars
- bad skew due to use of signed long rather than the unsigned long
  used in the C implementation
- revert to old hash functions
  - performance of the hash is a little slower, need to measure how
    much
- already tried using Guava UnsignedLong
  - does not have required operations for the xxhash algorithm
    - bit shifts
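
For anyone following along, here is a small sketch of one way a signed/unsigned mismatch can show up when porting C hash code to Java. Java's `long` is always signed, but addition, multiplication, and XOR produce the same bits as their unsigned counterparts. The operations that differ are right shift (`>>` vs `>>>`), division, and comparison. The class name and the multiplier below are made up for illustration, and this is not Drill's hash code or the xxhash constants.

```java
/** Hypothetical illustration; not Drill's hash implementation. */
class UnsignedLongDemo {

    // Arbitrary odd 64-bit multiplier chosen for illustration only.
    static final long MULTIPLIER = 0x9E3779B97F4A7C15L;

    /** Mixing step that matches unsigned C semantics: logical right shift. */
    static long mixUnsigned(long h) {
        h ^= h >>> 33;
        return h * MULTIPLIER; // wraps modulo 2^64, same bits as an unsigned multiply
    }

    /** The same step with an arithmetic (sign-extending) shift. */
    static long mixSigned(long h) {
        h ^= h >> 33;
        return h * MULTIPLIER;
    }

    public static void main(String[] args) {
        long h = 0x8000000000000001L; // top bit set, so negative as a signed long

        System.out.println(Long.toHexString(h >>> 33)); // zero-filled from the left
        System.out.println(Long.toHexString(h >> 33));  // sign bit copied in from the left

        // The two variants agree for non-negative inputs and diverge once the top bit is set.
        System.out.println(mixUnsigned(5L) == mixSigned(5L));
        System.out.println(mixUnsigned(h) == mixSigned(h));

        // Standard-library helpers for the other unsigned operations (Java 8+).
        System.out.println(Long.toUnsignedString(-1L));
        System.out.println(Long.compareUnsigned(-1L, 1L) > 0);
    }
}
```

Since primitive `long` with `>>>` covers shifts, and `Long.divideUnsigned` / `Long.compareUnsigned` cover the rest, a wrapper type may not be needed at all for this kind of algorithm.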


Hakim

- warning from parquet library
  - parquet "corrupt statistics" message is showing with Drill 1.5
- issue with partitioned files
  - partitioning on a date column seems to be the issue

On Tue, Jan 5, 2016 at 11:56 AM, Jason Altekruse wrote:

> Come join the Drill community in our weekly hangout meeting to find out
> what is going on with Drill right now.
>
> https://plus.google.com/hangouts/_/dremio.com/drillhangout
>
> Some items I would like to discuss this week:
> - 1.5 release, issues left to fix, when would we like to target for a vote
> - Drill parquet date bug: https://issues.apache.org/jira/browse/DRILL-4203
>
> Feel free to respond with other items you would like to have discussed, or
> just jump on the call.
>
>
>


Hangout starting in 5 minutes!

2015-08-18 Thread Mehant Baid

Come join the Drill community hangout as we discuss what has been happening
lately and what is in the pipeline. All are welcome, whether you know about
Drill, want to know more, or just want to listen in.

Link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Thanks