[
https://issues.apache.org/jira/browse/PARQUET-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Friedrich reassigned PARQUET-334:
Assignee: Thomas Friedrich
> UT TestSummary failed with "java.lang.RuntimeExceptio
I got the file, I should have time to look at it today.
On Mon, Dec 7, 2015 at 3:05 PM, Daniel Weeks
wrote:
> I sent Jason a file that can reproduce the issue with just 1K lines in it.
>
> If you want, I can open a JIRA and attach the file.
>
> 5a45ae3b1deb5117cb9e9a13141eeab1e9ad3d71 Can read t
I sent Jason a file that can reproduce the issue with just 1K lines in it.
If you want, I can open a JIRA and attach the file.
5a45ae3b1deb5117cb9e9a13141eeab1e9ad3d71 Can read the file without issue
6b605a4ea05b66e1a6bf843353abcb4834a4ced8 (bytebuffer) cannot read the file
-Dan
On Mon, Dec 7,
On 12/07/2015 02:23 PM, Stephen Bly wrote:
Thank you all for your help, I think I know what I need to do now. At some point
maybe I can contribute to the Parquet project to allow Hive to access columns
by looking at the stored ID field instead of the field name.
That would be great! Let us kn
In the meantime if you have the stacktrace for this error that would help
too.
On Fri, Dec 4, 2015 at 1:59 PM, Jason Altekruse
wrote:
> I assume that the buffer that we are giving to thrift doesn't have the
> header in it at the expected position. We hadn't seen this error in any of
> our regres
To follow-up, after discussing with more senior engineers at my company:
I misread what Julien said in regards to accessing by column index. I thought
this was equivalent to Thrift ID, but now I understand what he actually meant,
and that solution is unfortunately not viable for our use case.
I
Thanks Cheng!
Here is a useful blog post:
http://grepalex.com/2014/05/13/parquet-file-format-and-object-model/
about 2.
On Sun, Dec 6, 2015 at 9:52 PM, Cheng Lian wrote:
> cc parquet-dev list (it would be nice to always do so for these general
> questions.)
>
> Cheng
>
> On 12/6/15 3:10 PM, Shus
On 12/07/2015 11:21 AM, Stephen Bly wrote:
Thank you all for your detailed responses. Let me make sure I have this right:
I can write the Parquet file in any way I want, including using our own custom
Thrift code. Hive does not care, because it will use the schema stored in the
Parquet file t
Thank you all for your detailed responses. Let me make sure I have this right:
I can write the Parquet file in any way I want, including using our own custom
Thrift code. Hive does not care, because it will use the schema stored in the
Parquet file together with the schema I specified when crea
(CC'ing some folks who may have more context)
There's a setting to make the column lookup by index instead (ignoring
names in the file):
https://github.com/apache/parquet-mr/search?utf8=%E2%9C%93&q=PARQUET_COLUMN_INDEX_ACCESS
(remember that the serde code has moved to hive itself so parquet-hive is
Hi Stephen,
I'm not sure I follow your scenario.
So you have your own Thrift generator; you then write (outside of Hive,
using your own input/output format) the Thrift objects out into HDFS using
your own custom Parquet output format. You have two questions:
1. Is there anything special you need to
Hi Stephen,
Good questions. I think there is a slight misunderstanding with some of
the components, so I'll go over how they relate to one another first.
There are several different object models -- ways of working with data
in memory -- including Thrift, Hive, and Avro (to name just one othe
I've been using the two interchangeably. It depends on the order. If you
have an element e and a set S, then you can test if "e in S" or "S
contains e". They are both the same operation, but expressed in terms of
the element first or the set first.
rb
On 12/05/2015 08:25 AM, Flavio Pompermaier
Greetings Parquet experts. I am in need of a little help.
I am a (very) Junior developer at my company, and I have been tasked with
adding the Parquet file format to our Hadoop ecosystem. Our main use case is in
creating Hive tables on Parquet data and querying them.
As you know, Hive can creat
André Kelpe created PARQUET-399:
---
Summary: website should list mailing list adresses
Key: PARQUET-399
URL: https://issues.apache.org/jira/browse/PARQUET-399
Project: Parquet
Issue Type: Improve