[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-28 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has uploaded a new patch set (#5).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

This patch introduces a new query option,
PARQUET_FALLBACK_SCHEMA_RESOLUTION, which allows Parquet files' schemas
to be resolved by either name or position. It's "fallback" because
eventually field IDs will be the primary schema resolution scheme, and
we don't want to create an option that we will have to rename
later. The default is still by position. I chose a query option
because it will make testing easier and make it easier to
diagnose resolution problems quickly in the field. If users want to
switch the default behavior to by-name resolution (like Hive), they
can use the --default_query_options flag.
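
To make the two modes concrete, here is a minimal C++ sketch. It is not code from this patch: the enum FallbackSchemaResolution and the helper ResolveColumn are hypothetical names, and exact string comparison of column names is an assumption made for brevity. It only illustrates how the same table column can map to different file columns depending on the mode.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the option's two settings.
enum class FallbackSchemaResolution { POSITION = 0, NAME = 1 };

// Returns the index of the file column backing the table column at
// 'table_idx', or -1 if the file has no matching column.
int ResolveColumn(const std::vector<std::string>& table_cols,
                  const std::vector<std::string>& file_cols,
                  std::size_t table_idx, FallbackSchemaResolution mode) {
  assert(table_idx < table_cols.size());
  if (mode == FallbackSchemaResolution::NAME) {
    // By name: search the file schema for a column with the same name
    // (exact match is an assumption of this sketch).
    for (std::size_t i = 0; i < file_cols.size(); ++i) {
      if (file_cols[i] == table_cols[table_idx]) return static_cast<int>(i);
    }
    return -1;
  }
  // By position: the Nth table column maps to the Nth file column.
  if (table_idx < file_cols.size()) return static_cast<int>(table_idx);
  return -1;
}
```

With table columns (id, name) and a file written as (name, id), POSITION maps table column 0 to file column 0, while NAME maps it to file column 1.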

This patch also introduces a new test section, SHELL, which can be
used to execute shell commands in a .test file. This is useful for
copying files into test tables.

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 349 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/5
-- 
To view, visit http://gerrit.cloudera.org:8080/2384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-28 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2384/5/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: value
case insensitive check?


http://gerrit.cloudera.org:8080/#/c/2384/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 176: string
I'm not necessarily opposed to this being a string, but why did you do this 
over an enum as we do elsewhere?



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2384/5/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: value
> case insensitive check?
Done


http://gerrit.cloudera.org:8080/#/c/2384/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 176: string
> I'm not necessarily opposed to this being a string, but why did you do this
Ah didn't see that, will change to enum



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has uploaded a new patch set (#6).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 368 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/6

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has uploaded a new patch set (#7).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 389 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/7

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 2031: TParquetFallbackSchemaResolution::POSITION);
move this DCHECK to L2065. no reason to have a separate if-stmt for it when it 
can be incorporated into the code control flow. also, consider getting rid of 
the 'resolve_by_name' variable.
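
The suggested shape can be sketched as follows. This is an illustration only, not the scanner code: DescribeLookup is a hypothetical helper, the enum is a stand-in for TParquetFallbackSchemaResolution, and assert stands in for DCHECK. The point is that the check lives in the branch that relies on it rather than in a separate if-statement up front.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for TParquetFallbackSchemaResolution.
enum class FallbackSchemaResolution { POSITION = 0, NAME = 1 };

// Hypothetical helper: reports which lookup a caller would perform.
std::string DescribeLookup(FallbackSchemaResolution mode) {
  if (mode == FallbackSchemaResolution::NAME) {
    return "name";
  } else {
    // The check sits in the branch that depends on it, so a future
    // third enum value would trip it exactly where the assumption is made.
    assert(mode == FallbackSchemaResolution::POSITION);
    return "position";
  }
}
```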


http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: "0"
Why allow the numerical enum value (especially given that the enum is not 
exposed)?  I see in other options we sometimes allow it and other times don't, 
so I guess I'm okay either way but curious about the reasoning.


Line 379: position
in other statuses above, we use CAPS for the option name, no quotes, and also 
put the numerical value in parentheses. would be nice to be consistent.  
(Though I think the parentheses notation for the number is kinda confusing)



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: "0"
> Why allow the numerical enum value? (Especially given that the enum is not 
I asked her to add this. It's because TQueryOptionsToMap ends up writing out 
the values as the enum values, so this allows us to parse it back. There are 
some cases (e.g. I think TQueryOptionsToMap sends the client those strings and 
then the client may end up sending them back) where that can be a problem if we 
don't handle it.
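
A rough C++ sketch of that round trip, under stated assumptions: ToMapValue is a hypothetical stand-in for the way TQueryOptionsToMap is described above (writing the enum's numeric value), and ParseFallbackSchemaResolution is a hypothetical parser, not the code in query-options.cc. It accepts the names case-insensitively and also the numeric strings, so a value that was written out can be parsed back.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for TParquetFallbackSchemaResolution.
enum class FallbackSchemaResolution { POSITION = 0, NAME = 1 };

// Hypothetical serializer: writes the enum as its numeric value.
std::string ToMapValue(FallbackSchemaResolution mode) {
  return std::to_string(static_cast<int>(mode));
}

// Returns true and sets *out if 'value' is a recognized setting.
// Names are matched case-insensitively; "0" and "1" are accepted so that
// serialized values can be read back.
bool ParseFallbackSchemaResolution(const std::string& value,
                                   FallbackSchemaResolution* out) {
  assert(out != nullptr);
  std::string lower(value);
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  if (lower == "position" || lower == "0") {
    *out = FallbackSchemaResolution::POSITION;
    return true;
  }
  if (lower == "name" || lower == "1") {
    *out = FallbackSchemaResolution::NAME;
    return true;
  }
  return false;  // Unknown value: the caller would report an error.
}
```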



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: "0"
> I asked her to add this. It's because TQueryOptionsToMap ends up writing ou
Okay. I think a short comment near the top of this routine explaining that 
would be helpful.
Also, since these numbers must correspond to the enum values, adding this would 
be good (or make the code use the enum values directly):

DCHECK_EQ(TParquetFallbackSchemaResolution::POSITION, 0);
DCHECK_EQ(TParquetFallbackSchemaResolution::NAME, 1);

Not this change, but is COMPRESSION_CODEC problematic then?
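
For illustration, the two alternatives mentioned here (checking the numbers, or using the enum values directly) might look like this in C++. The enum and MatchesEnumValue are hypothetical stand-ins, not the patch's code, and static_assert is used as a compile-time analogue of the DCHECK_EQ pair.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for TParquetFallbackSchemaResolution.
enum class FallbackSchemaResolution { POSITION = 0, NAME = 1 };

// Option 1: keep the literals "0" and "1" in the parser, but fail the build
// if the enum values ever drift from them.
static_assert(static_cast<int>(FallbackSchemaResolution::POSITION) == 0,
              "parser assumes POSITION == 0");
static_assert(static_cast<int>(FallbackSchemaResolution::NAME) == 1,
              "parser assumes NAME == 1");

// Option 2: avoid the literals and compare against the enum's own value.
bool MatchesEnumValue(const std::string& value, FallbackSchemaResolution e) {
  return value == std::to_string(static_cast<int>(e));
}
```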



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: "0"
> Okay. I think a short comment near the top of this routine explaining that 
Yeah these are good suggestions, and probably COMPRESSION_CODEC could be broken 
in some cases, though I wasn't able to produce an issue in a few minutes of 
playing with it. I'd like for us to find a way to avoid this issue in general. 
I thought about it briefly in the past but didn't have a good solution. I agree 
it'd be nice to add a comment at the top for now.



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has uploaded a new patch set (#8).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 393 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/8

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 2031: TParquetFallbackSchemaResolution::POSITION);
> move this DCHECK to L2065. no reason to have a separate if-stmt for it when
Done. FWIW I put this extra if statement so you don't have to read through all 
of the below to figure out there's only two options, but I'm fine with moving 
it. I'm gonna keep 'resolve_by_name' since I use it on L2078.


http://gerrit.cloudera.org:8080/#/c/2384/7/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 371: "0"
> Yeah these are good suggestions, and probably COMPRESSION_CODEC could be br
I added the comment and now compare directly against the enum values.


Line 379: position
> in other statuses above, we use CAPS for the option name, no quotes, and al
I used caps, but left out the numerical values since they're not really for 
users.



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 8: Code-Review+2

(4 comments)

Please see if Matt wanted to make another pass before committing.

http://gerrit.cloudera.org:8080/#/c/2384/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 2056: ordinal
position


Line 2061: ordinal
position (just so we have a consistent terminology).


http://gerrit.cloudera.org:8080/#/c/2384/8/testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test:

Line 203:  QUERY
the comments for the other queries were helpful. how about one here.


Line 213:  QUERY
and here, to explain what is being tested.



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/2384/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 2056: ordinal
> position
Done


Line 2061: ordinal
> position (just so we have a consistent terminology).
Done


http://gerrit.cloudera.org:8080/#/c/2384/8/testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test:

Line 203:  QUERY
> the comments for the other queries were helpful. how about one here.
Done


Line 213:  QUERY
> and here, to explain what is being tested.
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Skye Wanderman-Milne (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/2384

to look at the new patch set (#9).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/9

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-30 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..


Patch Set 9: Code-Review+1

thanks!


Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-31 Thread Skye Wanderman-Milne (Code Review)
Hello Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/2384

to look at the new patch set (#10).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/10

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-31 Thread Skye Wanderman-Milne (Code Review)
Hello Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/2384

to look at the new patch set (#11).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query 
option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/11

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-04-01 Thread Skye Wanderman-Milne (Code Review)
Hello Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/2384

to look at the new patch set (#12).

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
..

IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

This patch introduces a new query option,
PARQUET_FALLBACK_SCHEMA_RESOLUTION, which allows Parquet files' schemas
to be resolved by either name or position.  It's "fallback" because
eventually field IDs will be the primary schema resolution scheme, and
we don't want to create an option whose name we will have to change
later. The default is still by position. I chose a query option
because it makes testing easier and makes resolution problems easier
to diagnose quickly in the field. Users who want to switch the default
behavior to by name (like Hive) can use the --default_query_options
flag.

This patch also introduces a new test section, SHELL, which can be
used to execute shell commands in a .test file. This is useful for
copying files into test tables.

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/12
-- 
To view, visit http://gerrit.cloudera.org:8080/2384

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 12
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-04-01 Thread Skye Wanderman-Milne (Code Review)
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
..


Patch Set 12: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/2384

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 12
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-04-01 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
..


IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

This patch introduces a new query option,
PARQUET_FALLBACK_SCHEMA_RESOLUTION, which allows Parquet files' schemas
to be resolved by either name or position.  It's "fallback" because
eventually field IDs will be the primary schema resolution scheme, and
we don't want to create an option whose name we will have to change
later. The default is still by position. I chose a query option
because it makes testing easier and makes resolution problems easier
to diagnose quickly in the field. Users who want to switch the default
behavior to by name (like Hive) can use the --default_query_options
flag.
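
To make the difference between the two resolution modes concrete, here is
a minimal Python sketch. It is not the scanner code from
hdfs-parquet-scanner.cc; the function name resolve_columns, the mode
strings "POSITION" and "NAME", and the case-insensitive name comparison
are all assumptions made for illustration.

```python
from typing import List, Optional


def resolve_columns(table_cols: List[str], file_cols: List[str],
                    mode: str = "POSITION") -> List[Optional[int]]:
    """Map each table column to an index into file_cols, or None if absent."""
    if mode == "POSITION":
        # The i-th table column reads the i-th file column, whatever its name.
        return [i if i < len(file_cols) else None
                for i in range(len(table_cols))]
    if mode == "NAME":
        # Assumption: names are compared case-insensitively in this sketch.
        index_by_name = {name.lower(): i for i, name in enumerate(file_cols)}
        return [index_by_name.get(col.lower()) for col in table_cols]
    raise ValueError("unknown resolution mode: %r" % mode)


table_cols = ["key", "value"]
file_cols = ["value", "key"]  # same fields, written in the opposite order
by_position = resolve_columns(table_cols, file_cols, "POSITION")
by_name = resolve_columns(table_cols, file_cols, "NAME")
```

With the fields written in the opposite order, by-position resolution
pairs "key" with the file's "value" column, while by-name resolution
follows the names and swaps the indexes.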

This patch also introduces a new test section, SHELL, which can be
used to execute shell commands in a .test file. This is useful for
copying files into test tables.
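
As a rough illustration of how a SHELL section could be handled, the
Python sketch below splits one test case into named sections and runs
the SHELL section before the query. It is not the code in
tests/util/test_file_parser.py; the "====" separators, the "---- NAME"
section headers, and the helper names parse_test_case and
run_shell_section are assumptions made for this example.

```python
import subprocess
from typing import Dict, List, Optional


def parse_test_case(text: str) -> Dict[str, str]:
    """Split one test case into {section name: section body}."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if line.startswith("----"):
            # A header such as "---- SHELL" opens a new section.
            current = line[4:].strip()
            sections[current] = []
        elif line.strip() == "====":
            # A separator line ends the current test case.
            current = None
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}


def run_shell_section(sections: Dict[str, str]) -> str:
    """Run the SHELL section, if present, and return its stdout."""
    if "SHELL" not in sections:
        return ""
    result = subprocess.run(sections["SHELL"], shell=True, check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout


example = """====
---- SHELL
echo copying files
---- QUERY
select 1
====
"""
sections = parse_test_case(example)
shell_output = run_shell_section(sections)
```

In a real test the SHELL body would be the command that copies a data
file into the table's directory; echo stands in for it here.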

Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Reviewed-on: http://gerrit.cloudera.org:8080/2384
Reviewed-by: Skye Wanderman-Milne 
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A testdata/parquet_schema_resolution/README
A testdata/parquet_schema_resolution/switched_map.avsc
A testdata/parquet_schema_resolution/switched_map.json
A testdata/parquet_schema_resolution/switched_map.parq
A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/conftest.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
15 files changed, 395 insertions(+), 18 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Skye Wanderman-Milne: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/2384

Gerrit-MessageType: merged
Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
Gerrit-PatchSet: 13
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Silvius Rus 
Gerrit-Reviewer: Skye Wanderman-Milne 


Re: [Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Matthew Jacobs
Can you make sure to parse the integer enum values as well as the string
names?
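
For illustration, accepting both forms could look like the Python sketch
below. The real parsing lives in be/src/service/query-options.cc and is
not shown in this thread; the enum member names, their numbering
(POSITION = 0, NAME = 1), and the helper parse_resolution are
assumptions.

```python
from enum import IntEnum


class ParquetFallbackSchemaResolution(IntEnum):
    # Assumed member names and numbering, for illustration only.
    POSITION = 0
    NAME = 1


def parse_resolution(value: str) -> ParquetFallbackSchemaResolution:
    """Accept either an enum name (any case) or its integer value."""
    text = value.strip()
    if text.isdigit():
        # Raises ValueError if the integer is not a defined enum value.
        return ParquetFallbackSchemaResolution(int(text))
    try:
        return ParquetFallbackSchemaResolution[text.upper()]
    except KeyError:
        raise ValueError("invalid schema resolution value: %r" % value)
```

A test for this would set the option once by name and once by number and
check that both produce the same result, and that an out-of-range number
is rejected.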
On Tue, Mar 29, 2016 at 2:36 PM Skye Wanderman-Milne (Code Review) <
ger...@cloudera.org> wrote:

> Skye Wanderman-Milne has uploaded a new patch set (#6).
>
> Change subject: IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
> ..
>
> IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option
>
> This patch introduces a new query option,
> PARQUET_FALLBACK_SCHEMA_RESOLUTION, which allows Parquet files' schemas
> to be resolved by either name or position.  It's "fallback" because
> eventually field IDs will be the primary schema resolution scheme, and
> we don't want to create an option whose name we will have to change
> later. The default is still by position. I chose a query option
> because it makes testing easier and makes resolution problems easier
> to diagnose quickly in the field. Users who want to switch the default
> behavior to by name (like Hive) can use the --default_query_options
> flag.
>
> This patch also introduces a new test section, SHELL, which can be
> used to execute shell commands in a .test file. This is useful for
> copying files into test tables.
>
> Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
> ---
> M be/src/exec/hdfs-parquet-scanner.cc
> M be/src/exec/hdfs-parquet-scanner.h
> M be/src/service/query-options.cc
> M be/src/service/query-options.h
> M common/thrift/ImpalaInternalService.thrift
> M common/thrift/ImpalaService.thrift
> A testdata/parquet_schema_resolution/README
> A testdata/parquet_schema_resolution/switched_map.avsc
> A testdata/parquet_schema_resolution/switched_map.json
> A testdata/parquet_schema_resolution/switched_map.parq
> A testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
> M tests/common/impala_test_suite.py
> M tests/conftest.py
> M tests/query_test/test_scanners.py
> M tests/util/test_file_parser.py
> 15 files changed, 368 insertions(+), 18 deletions(-)
>
>
>   git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/84/2384/6
> --
> To view, visit http://gerrit.cloudera.org:8080/2384
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: newpatchset
> Gerrit-Change-Id: Id0c715ea23792b2a6872610839a40532aabbb5a6
> Gerrit-PatchSet: 6
> Gerrit-Project: Impala
> Gerrit-Branch: cdh5-trunk
> Gerrit-Owner: Skye Wanderman-Milne 
> Gerrit-Reviewer: Dan Hecht 
> Gerrit-Reviewer: Juan Yu 
> Gerrit-Reviewer: Matthew Jacobs 
> Gerrit-Reviewer: Michael Ho 
> Gerrit-Reviewer: Silvius Rus 
> Gerrit-Reviewer: Skye Wanderman-Milne 
>


Re: [Impala-CR](cdh5-trunk) IMPALA-2835: introduce PARQUET_FALLBACK_SCHEMA_RESOLUTION query option

2016-03-29 Thread Skye Wanderman-Milne
Yes, will add to test

On Tue, Mar 29, 2016 at 3:55 PM, Matthew Jacobs wrote:

> Can you make sure to parse the integer enum values as well as the string
> names?