[jira] [Created] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format
Colin Ma created HIVE-16465:
-------------------------------

             Summary: NullPointer Exception when enable vectorization for Parquet file format
                 Key: HIVE-16465
                 URL: https://issues.apache.org/jira/browse/HIVE-16465
             Project: Hive
          Issue Type: Bug
            Reporter: Colin Ma
            Assignee: Colin Ma
            Priority: Critical

A NullPointerException is thrown when vectorization is enabled for the Parquet file format. It is caused by a null InputSplit.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16463) Change license for java transaction jta jar to CDDL 1.0
Alan Gates created HIVE-16463:
---------------------------------

             Summary: Change license for java transaction jta jar to CDDL 1.0
                 Key: HIVE-16463
                 URL: https://issues.apache.org/jira/browse/HIVE-16463
             Project: Hive
          Issue Type: Bug
            Reporter: Alan Gates
            Assignee: Alan Gates

Previously I erroneously said that this jar was under the SCSL 3.0 license. But further research has shown I was wrong and it is released under CDDL 1.0. So we need to change the license file for this jar in the binaries directory.
[jira] [Created] (HIVE-16462) Vectorization: Enabling hybrid grace disables specialization of all reduce side joins
Jason Dere created HIVE-16462:
---------------------------------

             Summary: Vectorization: Enabling hybrid grace disables specialization of all reduce side joins
                 Key: HIVE-16462
                 URL: https://issues.apache.org/jira/browse/HIVE-16462
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
            Reporter: Jason Dere
            Assignee: Jason Dere

Observed by [~gopalv]. Having grace hash join enabled prevents the Vectorizer from choosing the specialized vector hash joins during query planning. However, hive.llap.enable.grace.join.in.llap will later disable grace hash join anyway (LlapDecider runs after Vectorizer). If we can disable grace hash join before vectorization kicks in, we can still benefit from the specialized vector hash joins. This can be special-cased for the llap.execution.mode=only case.
[jira] [Created] (HIVE-16461) DagUtils checks local resource size on the remote fs
Sergey Shelukhin created HIVE-16461:
---------------------------------------

             Summary: DagUtils checks local resource size on the remote fs
                 Key: HIVE-16461
                 URL: https://issues.apache.org/jira/browse/HIVE-16461
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin

The path for a local file may have no scheme.
[jira] [Created] (HIVE-16460) In the console output, show vertex list in topological order instead of an alphabetical sort
Siddharth Seth created HIVE-16460:
-------------------------------------

             Summary: In the console output, show vertex list in topological order instead of an alphabetical sort
                 Key: HIVE-16460
                 URL: https://issues.apache.org/jira/browse/HIVE-16460
             Project: Hive
          Issue Type: Improvement
            Reporter: Siddharth Seth

cc [~prasanth_j]
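To illustrate the difference HIVE-16460 is asking for, here is a minimal sketch in Python. It is not Hive or Tez code: the vertex names and the dependency map are hypothetical, and the sort uses the standard-library graphlib module (Python 3.9+). It only shows why an alphabetical listing can place a vertex before the vertex that feeds it.

```python
from graphlib import TopologicalSorter

# Hypothetical DAG for illustration only: each vertex maps to the set of
# vertices whose output it consumes. "Map 3" reads the output of "Reducer 2".
dag = {
    "Map 1": set(),
    "Reducer 2": {"Map 1"},
    "Map 3": {"Reducer 2"},
}

# Alphabetical listing: "Map 3" appears before its input "Reducer 2".
alphabetical = sorted(dag)

# Topological listing: every vertex appears after all of its inputs.
topological = list(TopologicalSorter(dag).static_order())

print(alphabetical)
print(topological)
```

With a listing in topological order, the console output reads in the same direction the data flows, so upstream vertices always appear above the vertices that depend on them.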
Re: Do you feel a need for schema when querying JSON files in hive?
So no one knows about this? I was hoping to draw on knowledge others have already acquired on this subject :(

On Tue, Apr 11, 2017 at 2:09 AM, S G wrote:

> Hi,
>
> There is a concept of JsonSerDe where you need to specify a structure for
> your tables in order to query them.
>
> However, since the schema for an object is prone to change (once every few
> months is not unexpected), how do you handle that change in your hive/pig
> queries?
>
> Moreover, since JSON files are not demarcated according to schema, it is
> possible that a single JSON file has json-data for multiple evolutions of a
> schema (like 10 objects of ClassAnimal1, 20 of ClassAnimal2, 100 of
> ClassAnimal3 etc., where ClassAnimal1, ClassAnimal2 and ClassAnimal3
> represent the schema for ClassAnimal at different times).
>
> For such a JSON file, what is the recommended way of querying?
>
> I know that Avro solves this problem by maintaining a single file for a
> single kind of schema. So it will have 3 files for the above case (1 each
> for ClassAnimal1, ClassAnimal2 and ClassAnimal3).
>
> But since Avro is binary, hard to debug and requires a schema-repository
> (for non-hive use-cases), we were hoping to solve this problem in JSON.
>
> Related questions:
> 1) Is it even a problem worth solving?
> 2) How many people use AvroSerDe as compared to JsonSerDe?
>
> Thanks
> SG
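For concreteness, here is a rough sketch in Python of one way a single file holding several schema evolutions could be read. It is only an illustration under stated assumptions, not a Hive or JsonSerDe recipe: it assumes one JSON object per line, that later schema versions only add fields, and that a missing field can be treated as null. The field names (name, legs, habitat) and the helper names are hypothetical.

```python
import json

# Hypothetical superset schema: the union of the fields across all
# evolutions of ClassAnimal, each mapped to the default used when absent.
SUPERSET_DEFAULTS = {"name": None, "legs": None, "habitat": None}


def normalize(line, defaults=SUPERSET_DEFAULTS):
    """Parse one JSON line and project it onto the superset schema."""
    record = json.loads(line)
    return {field: record.get(field, default) for field, default in defaults.items()}


def read_mixed(lines):
    """Normalize every non-blank line, whatever schema version wrote it."""
    return [normalize(line) for line in lines if line.strip()]


# Three objects written under three successive (hypothetical) schema versions.
sample = [
    '{"name": "cat"}',
    '{"name": "dog", "legs": 4}',
    '{"name": "eel", "legs": 0, "habitat": "sea"}',
]

rows = read_mixed(sample)
for row in rows:
    print(row)
```

This only covers additive changes. If a field is renamed or changes type between versions, each record would need some version marker so the reader knows which mapping to apply.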