[jira] [Resolved] (PARQUET-1482) [C++] Unable to read data from parquet file generated with parquetjs

2019-03-06 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/PARQUET-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved PARQUET-1482.
---
Resolution: Fixed

Issue resolved by pull request 3312
[https://github.com/apache/arrow/pull/3312]

> [C++] Unable to read data from parquet file generated with parquetjs
> 
>
> Key: PARQUET-1482
> URL: https://issues.apache.org/jira/browse/PARQUET-1482
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Hatem Helal
> Assignee: Rylan Dmello
> Priority: Major
> Labels: pull-request-available
> Fix For: cpp-1.6.0
>
> Attachments: feeds1kMicros.parquet
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> See the attached file. When I debug:
> {{% ./parquet-reader feeds1kMicros.parquet}}
> I see that {{scanner->HasNext()}} always returns false.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Parquet 1.11.0 Release to Maven Central

2019-03-06 Thread Wes McKinney
hi Masih,

Ryan responded to your e-mail shortly after you last wrote. You can
see the whole thread here:

https://lists.apache.org/thread.html/b61d362f52e712fcb6408dfb45a934c2864f47c93e43a27f60fa7597@%3Cdev.parquet.apache.org%3E

- Wes

On Wed, Mar 6, 2019 at 2:28 PM Derkani, Masih wrote:
>
> Hello,
>
> I'd be grateful if you could advise whether this is the right channel of
> communication.
>
> Thanks
>
> > On 1 Mar 2019, at 17:44, Derkani, Masih wrote:
> >
> > Hi Folks,
> >
> > Is there a timeline for the publication of 1.11.0 to the Maven Central
> > repository?
> > I see it was released on the repo 18 days ago.
> >
> > Is there any other maven repository that hosts the artifacts?
> >
> > Many thanks,
> > Masih
>


Re: Parquet 1.11.0 Release to Maven Central

2019-03-06 Thread Derkani, Masih
Hello,

I'd be grateful if you could advise whether this is the right channel of
communication.

Thanks

> On 1 Mar 2019, at 17:44, Derkani, Masih wrote:
> 
> Hi Folks,
> 
> Is there a timeline for the publication of 1.11.0 to the Maven Central repository?
> I see it was released on the repo 18 days ago.
> 
> Is there any other maven repository that hosts the artifacts?
> 
> Many thanks,
> Masih



Re: Column index testing break down

2019-03-06 Thread Wes McKinney
Is there anyone who might be able to take on the project of
implementing this in C++? We have a growing number of C++
Parquet users these days.

On Tue, Mar 5, 2019 at 9:54 AM Anna Szonyi wrote:
>
> Hi dev@ community,
>
> This week I would like to ask for some feedback on the testing we've been
> sending out.
> We've been sharing the most important test cases we've created for the
> write path of the Parquet column index feature; now we would like to hear
> from you!
>
> Is there anything else you feel is missing or would like to get clarity on?
>
> Thanks,
> Anna
>
> On Mon, Feb 25, 2019 at 6:26 PM Anna Szonyi wrote:
>
> > Hi dev@,
> >
> > After a week off, this week we have an excerpt from our internal data
> > interoperability testing, which tests compatibility between Hive, Spark and
> > Impala over Avro and Parquet. This test case is tailor-made to test
> > specific layouts so that files written using parquet-mr can be read by any
> > of the above-mentioned components. We have also checked fault-injection
> > cases.
> >
> > The test suite is currently private; however, we have made the test classes
> > corresponding to the following document public:
> > https://docs.google.com/document/d/1mHYQGXE4oM1zgg83MMc4ho1gmoJMeZcq9MWG99WgL3A
> >
> > Please find the test cases and their results here:
> > https://github.com/zivanfi/column-indexes-data-interop-tests-excerpts
> >
> > Best,
> > Anna
> >
> >
> >
> > On Mon, Feb 11, 2019 at 4:57 PM Anna Szonyi wrote:
> >
> >> Hi dev@,
> >>
> >> Last week we had a twofer: an e2e tool and an integration test validating the
> >> contract of column indexes (whether all values are between min and max
> >> and, if set, whether the boundary order is correct). There are some takeaways
> >> and corrections to be made to the former (like the max->min typo) - thanks
> >> for the feedback on that!
> >>
> >> The next installment is also an integration test that tests the filtering
> >> logic on files including simple and special cases (user defined function,
> >> complex filtering, no filtering, etc.).
> >>
> >>
> >> https://github.com/apache/parquet-mr/blob/e7db9e20f52c925a207ea62d6dda6dc4e870294e/parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestColumnIndexFiltering.java
> >>
> >> Please let me know if you have any questions/comments.
> >>
> >> Best,
> >> Anna
> >>
> >>
> >>
> >>
> >>
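
The column index contract described in the quoted message (every value in a
page lies between that page's min and max, and the declared boundary order
matches the actual ordering of the page bounds) can be sketched roughly as
follows. This is a minimal Python illustration, not the parquet-mr test code:
the function names and the representation of pages as plain lists of integers
are assumptions made for the example, and null pages are ignored.

```python
from typing import List, Sequence


def is_sorted(values: Sequence[int], descending: bool = False) -> bool:
    """Return True if values are non-decreasing (or non-increasing)."""
    pairs = zip(values, values[1:])
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


def check_column_index(
    pages: List[List[int]],   # hypothetical: decoded values of each page
    min_values: List[int],    # per-page minimum recorded in the index
    max_values: List[int],    # per-page maximum recorded in the index
    boundary_order: str,      # "UNORDERED", "ASCENDING" or "DESCENDING"
) -> bool:
    # The index must hold exactly one (min, max) entry per page.
    if not (len(pages) == len(min_values) == len(max_values)):
        return False

    # Part 1: every value must fall within its page's recorded bounds.
    for page, lo, hi in zip(pages, min_values, max_values):
        if any(v < lo or v > hi for v in page):
            return False

    # Part 2: if an order is declared, both bound lists must follow it.
    if boundary_order == "ASCENDING":
        return is_sorted(min_values) and is_sorted(max_values)
    if boundary_order == "DESCENDING":
        return (is_sorted(min_values, descending=True)
                and is_sorted(max_values, descending=True))
    return boundary_order == "UNORDERED"
```

For example, three pages with bounds (1, 3), (4, 6) and (7, 9) satisfy the
check with "ASCENDING" but not with "DESCENDING".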


4 Apache Events in 2019: DC Roadshow soon; next up Chicago, Las Vegas, and Berlin!

2019-03-06 Thread Rich Bowen
Dear Apache Enthusiast,

(You’re receiving this because you are subscribed to one or more user
mailing lists for an Apache Software Foundation project.)

TL;DR:
 * Apache Roadshow DC is in 3 weeks. Register now at
https://apachecon.com/usroadshowdc19/
 * Registration for Apache Roadshow Chicago is open.
http://apachecon.com/chiroadshow19
 * The CFP for ApacheCon North America is now open.
https://apachecon.com/acna19
 * Save the date: ApacheCon Europe will be held in Berlin, October 22nd
through 24th.  https://apachecon.com/aceu19


Registration is open for two Apache Roadshows; these are smaller events
with a more focused program and regional community engagement:

Our Roadshow event in Washington DC takes place in under three weeks, on
March 25th. We’ll be hosting a day-long event at the Fairfax campus of
George Mason University. The roadshow is a full day of technical talks
(two tracks) and an open source job fair featuring AWS, Bloomberg, dito,
GridGain, Linode, and Security University. For more details about the
program and the job fair, and to register, visit
https://apachecon.com/usroadshowdc19/

Apache Roadshow Chicago will be held May 13-14th at a number of venues
in Chicago’s Logan Square neighborhood. This event will feature sessions
in AdTech, FinTech and Insurance, startups, “Made in Chicago”, Project
Shark Tank (innovations from the Apache Incubator), community diversity,
and more. It’s a great way to learn about various Apache projects “at
work” while playing at a brewery, a beercade, and a neighborhood bar.
Sign up today at https://www.apachecon.com/chiroadshow19/

We’re delighted to announce that the Call for Presentations (CFP) is now
open for ApacheCon North America in Las Vegas, September 9-13th! As the
official conference series of the ASF, ApacheCon North America will
feature over a dozen Apache project summits, including Cassandra,
CloudStack, Tomcat, Traffic Control, and more. We’re looking for talks
in a wide variety of categories -- anything related to ASF projects and
the Apache development process. The CFP closes at midnight on May 26th.
In addition, the ASF will be celebrating its 20th Anniversary during the
event. For more details and to submit a proposal for the CFP, visit
https://apachecon.com/acna19/ . Registration will be opening soon.

Be sure to mark your calendars for ApacheCon Europe, which will be held
in Berlin, October 22-24th at the KulturBrauerei, a landmark of Berlin's
industrial history. In addition to innovative content from our projects,
we are collaborating with the Open Source Design community
(https://opensourcedesign.net/) to offer a track on design this year.
The CFP and registration will open soon at https://apachecon.com/aceu19/ .

Sponsorship opportunities are available for all events, with details
listed on each event’s site at http://apachecon.com/.

We look forward to seeing you!

Rich, for the ApacheCon Planners
@apachecon



[jira] [Commented] (PARQUET-1540) [C++] Set shared library version for linux and mac builds

2019-03-06 Thread Hatem Helal (JIRA)


[ 
https://issues.apache.org/jira/browse/PARQUET-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785475#comment-16785475
 ] 

Hatem Helal commented on PARQUET-1540:
--

This is a duplicate of ARROW-3185

> [C++] Set shared library version for linux and mac builds
> -
>
> Key: PARQUET-1540
> URL: https://issues.apache.org/jira/browse/PARQUET-1540
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Hatem Helal
> Assignee: Hatem Helal
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> It looks like this was previously implemented when parquet-cpp was managed as
> a separate repo (PARQUET-935). It would be good to add this back now that
> parquet-cpp has been incorporated into the Arrow project.





[jira] [Resolved] (PARQUET-1540) [C++] Set shared library version for linux and mac builds

2019-03-06 Thread Hatem Helal (JIRA)


 [ 
https://issues.apache.org/jira/browse/PARQUET-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hatem Helal resolved PARQUET-1540.
--
Resolution: Duplicate

> [C++] Set shared library version for linux and mac builds
> -
>
> Key: PARQUET-1540
> URL: https://issues.apache.org/jira/browse/PARQUET-1540
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Hatem Helal
> Assignee: Hatem Helal
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> It looks like this was previously implemented when parquet-cpp was managed as
> a separate repo (PARQUET-935). It would be good to add this back now that
> parquet-cpp has been incorporated into the Arrow project.


