[jira] [Updated] (PARQUET-511) Integer overflow on counting values in column

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-511: - Fix Version/s: 1.8.2 > Integer overflow on counting values in column > --

[jira] [Updated] (PARQUET-571) Fix potential leak in ParquetFileReader.close()

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-571: - Fix Version/s: 1.8.2 > Fix potential leak in ParquetFileReader.close() >

[jira] [Updated] (PARQUET-382) Add a way to append encoded blocks in ParquetFileWriter

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-382: - Fix Version/s: 1.8.2 > Add a way to append encoded blocks in ParquetFileWriter >

[jira] [Updated] (PARQUET-335) Avro object model should not require MAP_KEY_VALUE

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-335: - Fix Version/s: 1.8.2 > Avro object model should not require MAP_KEY_VALUE > -

[jira] [Updated] (PARQUET-423) Make writing Avro to Parquet less noisy

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-423: - Fix Version/s: 1.8.2 > Make writing Avro to Parquet less noisy >

[jira] [Updated] (PARQUET-378) Add thoroughly parquet test encodings

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-378: - Fix Version/s: 1.8.2 > Add thoroughly parquet test encodings > --

[jira] [Updated] (PARQUET-387) TwoLevelListWriter does not handle null values in array

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-387: - Fix Version/s: 1.8.2 > TwoLevelListWriter does not handle null values in array >

[jira] [Updated] (PARQUET-581) Min/max row count for page size check are conflated in some places

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-581: - Fix Version/s: 1.8.2 > Min/max row count for page size check are conflated in some places

[jira] [Updated] (PARQUET-623) DeltaByteArrayReader has incorrect skip behaviour

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-623: - Fix Version/s: 1.8.2 > DeltaByteArrayReader has incorrect skip behaviour > --

[jira] [Updated] (PARQUET-220) Unnecessary warning in ParquetRecordReader.initialize

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-220: - Fix Version/s: 1.8.2 > Unnecessary warning in ParquetRecordReader.initialize > --

[jira] [Updated] (PARQUET-660) Writing Protobuf messages with extensions results in an error or data corruption.

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-660: - Fix Version/s: 1.8.2 > Writing Protobuf messages with extensions results in an error or d

[jira] [Updated] (PARQUET-380) Cascading and scrooge builds fail when using thrift 0.9.0

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-380: - Fix Version/s: 1.8.2 > Cascading and scrooge builds fail when using thrift 0.9.0 > --

[jira] [Updated] (PARQUET-373) MemoryManager tests are flaky

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-373: - Fix Version/s: 1.8.2 > MemoryManager tests are flaky > - > >

[jira] [Updated] (PARQUET-645) DictionaryFilter incorrectly handles null

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-645: - Fix Version/s: 1.8.2 > DictionaryFilter incorrectly handles null > --

[jira] [Updated] (PARQUET-783) H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-783: - Fix Version/s: 1.8.2 > H2SeekableInputStream does not close its underlying FSDataInputStr

[jira] [Updated] (PARQUET-753) GroupType.union() doesn't merge the original type

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-753: - Fix Version/s: 1.8.2 > GroupType.union() doesn't merge the original type > --

[jira] [Updated] (PARQUET-686) Allow for Unsigned Statistics in Binary Type

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-686: - Fix Version/s: 1.8.2 > Allow for Unsigned Statistics in Binary Type > ---

[jira] [Updated] (PARQUET-361) Add prerelease logic to semantic versions

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-361: - Fix Version/s: 1.8.2 > Add prerelease logic to semantic versions > --

[jira] [Updated] (PARQUET-751) DictionaryFilter patch broke column projection

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-751: - Fix Version/s: 1.8.2 > DictionaryFilter patch broke column projection > -

[jira] [Updated] (PARQUET-99) Large rows cause unnecessary OOM exceptions

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-99: Fix Version/s: 1.8.2 > Large rows cause unnecessary OOM exceptions > ---

[jira] [Updated] (PARQUET-421) Fix mismatch of javadoc names and method parameters in module encoding, column, and hadoop

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-421: - Fix Version/s: 1.8.2 > Fix mismatch of javadoc names and method parameters in module enco

[jira] [Updated] (PARQUET-340) totalMemoryPool is truncated to 32 bits

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-340: - Fix Version/s: 1.8.2 > totalMemoryPool is truncated to 32 bits >

[jira] [Updated] (PARQUET-341) Improve write performance with wide schema sparse data

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-341: - Fix Version/s: 1.8.2 > Improve write performance with wide schema sparse data > -

[jira] [Updated] (PARQUET-305) Logger instantiated for package org.apache.parquet may be GC-ed

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-305: - Fix Version/s: 1.8.2 > Logger instantiated for package org.apache.parquet may be GC-ed >

[jira] [Updated] (PARQUET-349) VersionParser does not handle versions like "parquet-mr 1.6.0rc4"

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-349: - Fix Version/s: 1.8.2 > VersionParser does not handle versions like "parquet-mr 1.6.0rc4"

[jira] [Updated] (PARQUET-654) Make record-level filtering optional

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-654: - Fix Version/s: 1.8.2 > Make record-level filtering optional > ---

[jira] [Updated] (PARQUET-415) ByteBufferBackedBinary serialization is broken

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-415: - Fix Version/s: 1.8.2 > ByteBufferBackedBinary serialization is broken > -

[jira] [Updated] (PARQUET-528) Fix flush() for RecordConsumer and implementations

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-528: - Fix Version/s: 1.8.2 > Fix flush() for RecordConsumer and implementations > -

[jira] [Updated] (PARQUET-484) Warn when Decimal is stored as INT64 while could be stored as INT32

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-484: - Fix Version/s: 1.8.2 > Warn when Decimal is stored as INT64 while could be stored as INT3

[jira] [Updated] (PARQUET-685) Deprecated ParquetInputSplit constructor passes parameters in the wrong order.

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-685: - Fix Version/s: 1.8.2 > Deprecated ParquetInputSplit constructor passes parameters in the

[jira] [Updated] (PARQUET-569) ParquetMetadataConverter offset filter is broken

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-569: - Fix Version/s: 1.8.2 > ParquetMetadataConverter offset filter is broken > ---

[jira] [Updated] (PARQUET-674) Add an abstraction to get the length of a stream

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-674: - Fix Version/s: 1.8.2 > Add an abstraction to get the length of a stream > ---

[jira] [Updated] (PARQUET-548) Add Java metadata for PageEncodingStats

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-548: - Fix Version/s: 1.8.2 > Add Java metadata for PageEncodingStats >

[jira] [Updated] (PARQUET-432) Complete a todo for method ColumnDescriptor.compareTo()

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-432: - Fix Version/s: 1.8.2 > Complete a todo for method ColumnDescriptor.compareTo() >

[jira] [Updated] (PARQUET-318) Remove unnecessary objectmapper from ParquetMetadata

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-318: - Fix Version/s: 1.8.2 > Remove unnecessary objectmapper from ParquetMetadata > ---

[jira] [Updated] (PARQUET-431) Make ParquetOutputFormat.memoryManager volatile

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-431: - Fix Version/s: 1.8.2 > Make ParquetOutputFormat.memoryManager volatile >

[jira] [Updated] (PARQUET-560) Incorrect synchronization in SnappyCompressor

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-560: - Fix Version/s: 1.8.2 > Incorrect synchronization in SnappyCompressor > --

[jira] [Updated] (PARQUET-585) Slowly ramp up sizes of int[]s in IntList to keep sizes small when data sets are small

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-585: - Fix Version/s: 1.8.2 > Slowly ramp up sizes of int[]s in IntList to keep sizes small when

[jira] [Updated] (PARQUET-384) Add Dictionary Based Filtering to Filter2 API

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-384: - Fix Version/s: 1.8.2 > Add Dictionary Based Filtering to Filter2 API > --

[jira] [Updated] (PARQUET-352) Add tags to "created by" metadata in the file footer

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-352: - Fix Version/s: 1.8.2 > Add tags to "created by" metadata in the file footer > ---

[jira] [Updated] (PARQUET-389) Filter predicates should work with missing columns

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-389: - Fix Version/s: 1.8.2 > Filter predicates should work with missing columns > -

[jira] [Updated] (PARQUET-612) Add compression to FileEncodingIT tests

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-612: - Fix Version/s: 1.8.2 > Add compression to FileEncodingIT tests >

[jira] [Updated] (PARQUET-544) ParquetWriter.close() throws NullPointerException on second call, improper implementation of Closeable contract

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-544: - Fix Version/s: 1.8.2 > ParquetWriter.close() throws NullPointerException on second call,

[jira] [Updated] (PARQUET-801) Allow UserDefinedPredicates in DictionaryFilter

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-801: - Fix Version/s: 1.8.2 > Allow UserDefinedPredicates in DictionaryFilter >

[jira] [Updated] (PARQUET-356) Add ElephantBird section to LICENSE file

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-356: - Fix Version/s: 1.8.2 > Add ElephantBird section to LICENSE file > ---

[jira] [Updated] (PARQUET-396) The builder for AvroParquetReader loses the record type

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-396: - Fix Version/s: 1.8.2 > The builder for AvroParquetReader loses the record type >

[jira] [Updated] (PARQUET-241) ParquetInputFormat.getFooters() should return in the same order as what listStatus() returns

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-241: - Fix Version/s: 1.8.2 > ParquetInputFormat.getFooters() should return in the same order as

[jira] [Updated] (PARQUET-430) Change to use Locale parameterized version of String.toUpperCase()/toLowerCase

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-430: - Fix Version/s: 1.8.2 > Change to use Locale parameterized version of String.toUpperCase()

[jira] [Updated] (PARQUET-726) TestMemoryManager consistently fails

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-726: - Fix Version/s: 1.8.2 > TestMemoryManager consistently fails > ---

[jira] [Updated] (PARQUET-495) Fix mismatches in Types class comments

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-495: - Fix Version/s: 1.8.2 > Fix mismatches in Types class comments > -

[jira] [Updated] (PARQUET-743) DictionaryFilters can re-use StreamBytesInput when compressed

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-743: - Fix Version/s: 1.8.2 > DictionaryFilters can re-use StreamBytesInput when compressed > --

[jira] [Updated] (PARQUET-353) Compressors not getting recycled while writing parquet files, causing memory leak

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-353: - Fix Version/s: 1.8.2 > Compressors not getting recycled while writing parquet files, caus

[jira] [Updated] (PARQUET-393) release parquet-format 2.3.1

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-393: - Fix Version/s: 1.8.2 > release parquet-format 2.3.1 > > >

[jira] [Updated] (PARQUET-651) Parquet-avro fails to decode array of record with a single field name "element" correctly

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-651: - Fix Version/s: 1.8.2 > Parquet-avro fails to decode array of record with a single field n

[jira] [Updated] (PARQUET-364) Parquet-avro cannot decode Avro/Thrift array of primitive array (e.g. array<array<int>>)

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-364: - Fix Version/s: 1.8.2 > Parquet-avro cannot decode Avro/Thrift array of primitive array (e

[jira] [Updated] (PARQUET-355) Create Integration tests to validate statistics

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-355: - Fix Version/s: 1.8.2 > Create Integration tests to validate statistics >

[jira] [Updated] (PARQUET-791) Predicate pushing down on missing columns should work on UserDefinedPredicate too

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-791: - Fix Version/s: 1.8.2 > Predicate pushing down on missing columns should work on UserDefin

[jira] [Updated] (PARQUET-358) Add support for temporal logical types to AVRO/Parquet conversion

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-358: - Fix Version/s: 1.8.2 > Add support for temporal logical types to AVRO/Parquet conversion

[jira] [Updated] (PARQUET-363) Cannot construct empty MessageType for ReadContext.requestedSchema

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-363: - Fix Version/s: 1.8.2 > Cannot construct empty MessageType for ReadContext.requestedSchema

[jira] [Updated] (PARQUET-400) Error reading some files after PARQUET-77 bytebuffer read path

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-400: - Fix Version/s: 1.8.2 > Error reading some files after PARQUET-77 bytebuffer read path > -

[jira] [Updated] (PARQUET-343) Caching nulls on group node to improve write performance on wide schema sparse data

2018-04-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-343: - Fix Version/s: 1.8.2 > Caching nulls on group node to improve write performance on wide s

[jira] [Updated] (PARQUET-1217) Incorrect handling of missing values in Statistics

2018-04-24 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1217: -- Fix Version/s: 1.8.3 > Incorrect handling of missing values in Statistics > --

[jira] [Updated] (PARQUET-1246) Ignore float/double statistics in case of NaN

2018-04-25 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1246: -- Fix Version/s: 1.8.3 > Ignore float/double statistics in case of NaN > ---

[jira] [Created] (PARQUET-1294) Update release scripts for the new Apache policy

2018-05-09 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1294: - Summary: Update release scripts for the new Apache policy Key: PARQUET-1294 URL: https://issues.apache.org/jira/browse/PARQUET-1294 Project: Parquet

[jira] [Commented] (PARQUET-1295) Parquet libraries do not follow proper semantic versioning

2018-05-10 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470082#comment-16470082 ] Gabor Szadovszky commented on PARQUET-1295: --- I completely agree, [~vrozov]. Co

[jira] [Assigned] (PARQUET-1294) Update release scripts for the new Apache policy

2018-05-10 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1294: - Assignee: Gabor Szadovszky > Update release scripts for the new Apache policy >

[jira] [Commented] (PARQUET-1295) Parquet libraries do not follow proper semantic versioning

2018-05-10 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471533#comment-16471533 ] Gabor Szadovszky commented on PARQUET-1295: --- I think, you are right. We should

[jira] [Updated] (PARQUET-1211) Column indexes: read/write API

2018-05-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1211: -- Summary: Column indexes: read/write API (was: Write column indexes: read/write API)

[jira] [Updated] (PARQUET-1201) Column indexes

2018-05-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1201: -- Summary: Column indexes (was: Write column indexes) > Column indexes > --

[jira] [Updated] (PARQUET-1212) Column indexes: Show indexes in tools

2018-05-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1212: -- Summary: Column indexes: Show indexes in tools (was: Write column indexes: Show index

[jira] [Updated] (PARQUET-1213) Column indexes: Limit index size

2018-05-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1213: -- Summary: Column indexes: Limit index size (was: Write column indexes: Limit index siz

[jira] [Updated] (PARQUET-1214) Column indexes: Truncate min/max values

2018-05-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1214: -- Summary: Column indexes: Truncate min/max values (was: Write column indexes: Truncate

[jira] [Commented] (PARQUET-1295) Parquet libraries do not follow proper semantic versioning

2018-05-18 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480276#comment-16480276 ] Gabor Szadovszky commented on PARQUET-1295: --- [Here|https://github.com/apache/p

[jira] [Assigned] (PARQUET-1304) Release 1.10 contains breaking changes for Hive

2018-05-18 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1304: - Assignee: Gabor Szadovszky > Release 1.10 contains breaking changes for Hive >

[jira] [Resolved] (PARQUET-1277) Release Parquet-mr 1.8.3

2018-05-24 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1277. --- Resolution: Fixed > Release Parquet-mr 1.8.3 > > >

[jira] [Updated] (PARQUET-1201) Column indexes

2018-05-28 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1201: -- Description: Write the column indexes described in PARQUET-922. This is the first pha

[jira] [Created] (PARQUET-1310) Column indexes: Filtering

2018-05-28 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1310: - Summary: Column indexes: Filtering Key: PARQUET-1310 URL: https://issues.apache.org/jira/browse/PARQUET-1310 Project: Parquet Issue Type: Sub-task

[jira] [Resolved] (PARQUET-1212) Column indexes: Show indexes in tools

2018-05-28 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1212. --- Resolution: Fixed > Column indexes: Show indexes in tools >

[jira] [Commented] (PARQUET-1244) Documentation link to logical types broken

2018-05-29 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493438#comment-16493438 ] Gabor Szadovszky commented on PARQUET-1244: --- Thanks a lot, [~nkollar]. +1 >

[jira] [Resolved] (PARQUET-1304) Release 1.10 contains breaking changes for Hive

2018-05-31 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1304. --- Resolution: Fixed Fix Version/s: 1.11.0 > Release 1.10 contains breaking cha

[jira] [Created] (PARQUET-1318) Wrong configuration is passed from Hadoop config to Parquet options

2018-06-04 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1318: - Summary: Wrong configuration is passed from Hadoop config to Parquet options Key: PARQUET-1318 URL: https://issues.apache.org/jira/browse/PARQUET-1318 Proje

[jira] [Assigned] (PARQUET-1309) Parquet Java uses incorrect stats and dictionary filter properties

2018-06-05 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1309: - Assignee: Gabor Szadovszky > Parquet Java uses incorrect stats and dictionary

[jira] [Resolved] (PARQUET-1318) Wrong configuration is passed from Hadoop config to Parquet options

2018-06-05 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1318. --- Resolution: Duplicate > Wrong configuration is passed from Hadoop config to Parquet

[jira] [Assigned] (PARQUET-1322) Statistics is not available for DECIMAL types

2018-06-10 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1322: - Assignee: Vlad Rozov > Statistics is not available for DECIMAL types > ---

[jira] [Commented] (PARQUET-1322) Statistics is not available for DECIMAL types

2018-06-11 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507740#comment-16507740 ] Gabor Szadovszky commented on PARQUET-1322: --- [~vrozov], In 1.10.0 we introdu

[jira] [Commented] (PARQUET-1322) Statistics is not available for DECIMAL types

2018-06-11 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508146#comment-16508146 ] Gabor Szadovszky commented on PARQUET-1322: --- Currently, we are not reading th

[jira] [Commented] (PARQUET-1322) Statistics is not available for DECIMAL types

2018-06-12 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509456#comment-16509456 ] Gabor Szadovszky commented on PARQUET-1322: --- For the INT32/INT64 we can use t

[jira] [Updated] (PARQUET-1335) Logical type names in parquet-mr are not consistent with parquet-format

2018-06-22 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1335: -- Affects Version/s: 1.11.0 > Logical type names in parquet-mr are not consistent with

[jira] [Commented] (PARQUET-1335) Logical type names in parquet-mr are not consistent with parquet-format

2018-06-22 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520285#comment-16520285 ] Gabor Szadovszky commented on PARQUET-1335: --- Please, make sure it'll be pushe

[jira] [Created] (PARQUET-1337) Implement better estimate of page size for RLE+bitpacking

2018-06-24 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1337: - Summary: Implement better estimate of page size for RLE+bitpacking Key: PARQUET-1337 URL: https://issues.apache.org/jira/browse/PARQUET-1337 Project: Parquet

[jira] [Resolved] (PARQUET-1336) PrimitiveComparator should implements Serializable

2018-06-26 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1336. --- Resolution: Fixed > PrimitiveComparator should implements Serializable > -

[jira] [Created] (PARQUET-1364) Column Indexes: Invalid row indexes for pages starting with nulls

2018-07-31 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1364: - Summary: Column Indexes: Invalid row indexes for pages starting with nulls Key: PARQUET-1364 URL: https://issues.apache.org/jira/browse/PARQUET-1364 Project

[jira] [Created] (PARQUET-1365) Don't write page level statistics for v1

2018-07-31 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1365: - Summary: Don't write page level statistics for v1 Key: PARQUET-1365 URL: https://issues.apache.org/jira/browse/PARQUET-1365 Project: Parquet Issue

[jira] [Created] (PARQUET-1379) Incorrect check for ASCENDING/DESCENDING at column index write path

2018-08-15 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1379: - Summary: Incorrect check for ASCENDING/DESCENDING at column index write path Key: PARQUET-1379 URL: https://issues.apache.org/jira/browse/PARQUET-1379 Proje

[jira] [Updated] (PARQUET-1379) Incorrect check for ASCENDING/DESCENDING at column index write path

2018-08-15 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1379: -- Issue Type: Sub-task (was: Bug) Parent: PARQUET-1201 > Incorrect check for A

[jira] [Created] (PARQUET-1386) Fix issues of NaN and +-0.0 in case of float/double column indexes

2018-08-17 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1386: - Summary: Fix issues of NaN and +-0.0 in case of float/double column indexes Key: PARQUET-1386 URL: https://issues.apache.org/jira/browse/PARQUET-1386 Projec

[jira] [Created] (PARQUET-1389) Improve value skipping at page synchronization

2018-08-17 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1389: - Summary: Improve value skipping at page synchronization Key: PARQUET-1389 URL: https://issues.apache.org/jira/browse/PARQUET-1389 Project: Parquet

[jira] [Resolved] (PARQUET-1310) Column indexes: Filtering

2018-08-17 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1310. --- Resolution: Fixed > Column indexes: Filtering > - > >

[jira] [Resolved] (PARQUET-1379) Incorrect check for ASCENDING/DESCENDING at column index write path

2018-08-18 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1379. --- Resolution: Not A Problem The check of ASCENDING/DESCENDING order is used when buil

[jira] [Created] (PARQUET-1399) Move parquet-mr related code from parquet-format

2018-08-22 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1399: - Summary: Move parquet-mr related code from parquet-format Key: PARQUET-1399 URL: https://issues.apache.org/jira/browse/PARQUET-1399 Project: Parquet
