From <preetham.polupar...@couchbase.com>:

Attention is currently required from: Wail Alkowaileet.
preetham.polupar...@couchbase.com has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209 )

Change subject: [ASTERIXDB-3392] Add parquet format for COPY TO
......................................................................


Patch Set 31:

(20 comments)

Commit Message:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/8ffbb309_9e6d5dc8
PS30, Line 7: WIP
> Reference the ticket. […]
Done


File asterixdb/asterix-common/src/main/resources/asx_errormsg/en.properties:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/80baa24d_ad02b22f
PS30, Line 307: Units
> unit
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/0ad3a013_0990bf62
PS30, Line 307: , given
> . Provided
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/220d2121_563e20c6
PS30, Line 308: Unsupported compression scheme
> Remove and use 1096
CompressionManager.java:73 uses 1096.


File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataConstants.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/01c30667_6cad1d7b
PS30, Line 87: 1MB
> Something could be off here. […]
https://parquet.apache.org/docs/file-format/configurations/
Keeping the row group size at 128MB; will run a performance test to find a better estimate.


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/160f0ffd_1b9fbba1
PS30, Line 315:     public static final String KEY_COMPRESSION_LZO = "lzo";
              :     public static final String KEY_COMPRESSION_LZ4_RAW = "lz4_raw";
              :     public static final String KEY_COMPRESSION_BROTLI = "brotli";
> Can you do a simple benchmark that compares both the size and write 
> throughput? If I remember correc […]
Okay, will do.


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/c345a0e2_6dd2cc4d
PS30, Line 331: JSON_WRITER_SUPPORTED_COMPRESSION
> TEXTUAL_WRITER_SUPPORTED_COMPRESSION […]
Done


File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/WriterValidationUtil.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/a395c3bb_a5ec9702
PS30, Line 115: validateJSONCompression
> validateTextualCompression
Done


File asterixdb/asterix-om/pom.xml:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/ab6ee8dc_fd816134
PS30, Line 163:       <dependency>
> Fix indentation
Done


File asterixdb/asterix-om/src/main/java/org/apache/asterix/om/pointables/printer/parquet/ParquetRecordLazyVisitor.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/d5868b26_6c2f3f9a
PS30, Line 55: ARecordType
> This will fail if it is ANY. Because ANY isn't of type ARecordType […]
Done
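
For readers following the thread, here is a minimal sketch of the kind of guard this comment asks for: check the runtime type before casting, so that an open (ANY) type does not trigger a ClassCastException. The interface and classes below (TypeSketch, AnyTypeSketch, RecordTypeSketch, RecordTypeGuard) are hypothetical stand-ins, not the AsterixDB classes, and this is not necessarily how the patch resolved the comment.

```java
// Hypothetical stand-ins for the type hierarchy; NOT the AsterixDB API.
interface TypeSketch {
}

// Stands in for an open type (ANY) that declares no fields.
final class AnyTypeSketch implements TypeSketch {
}

// Stands in for a closed record type with declared field names.
final class RecordTypeSketch implements TypeSketch {
    final String[] fieldNames;

    RecordTypeSketch(String... fieldNames) {
        this.fieldNames = fieldNames;
    }
}

final class RecordTypeGuard {
    // Returns the number of declared fields, or -1 when the type is not a
    // record type (for example ANY), instead of casting unconditionally.
    static int declaredFieldCount(TypeSketch type) {
        if (type instanceof RecordTypeSketch) {
            return ((RecordTypeSketch) type).fieldNames.length;
        }
        return -1;
    }
}
```

With this guard, the caller can take a separate path for open types instead of failing on the cast.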


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/8f721108_bd26d294
PS30, Line 82: throw new HyracksDataException(ErrorCode.TUPLE_DOES_NOT_AGREE_WITH_GIVEN_SCHEMA
> You cannot assume any exception is a typing error. […]
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/d5c990d3_5ef22b95
PS30, Line 90: asGroupType
> See the comment above
Done


File asterixdb/asterix-om/src/main/java/org/apache/asterix/om/pointables/printer/parquet/ParquetRecordVisitorUtils.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/3f459593_db39ecbb
PS30, Line 60:             case BOOLEAN:
             :             case BINARY:
             :             case FIXED_LEN_BYTE_ARRAY:
             :             case INT96:
> remove. 'default' suffices.
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/0e98711c_f07014be
PS30, Line 65: HyracksDataException
> throw RuntimeDataException.create(ErrorCode. […]
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/0826b084_94724681
PS30, Line 176:            switch (primitiveTypeName) {
              :                     case BOOLEAN:
              :                         recordConsumer.addBoolean(booleanValue);
              :                         break;
              :                     case BINARY:
              :                     case INT32:
              :                     case INT64:
              :                     case FLOAT:
              :                     case DOUBLE:
              :                     case FIXED_LEN_BYTE_ARRAY:
              :                     case INT96:
              :                     default:
              :                         throw new HyracksDataException(
              :                                 "Typecast impossible from " + typeTag + " to " + primitiveTypeName);
              :                 }
> Replace with if. […]
Done
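
A minimal sketch of what "replace with if" could look like for the quoted switch, where only the BOOLEAN case does any work and every other case falls through to the same exception. PrimitiveKind, BooleanSink and BooleanWriterSketch are hypothetical stand-ins for Parquet's primitive type name and record consumer, and IllegalArgumentException stands in for the project's own exception type; the actual patch may differ.

```java
// Hypothetical stand-in for the Parquet primitive type names quoted above.
enum PrimitiveKind {
    BOOLEAN, BINARY, INT32, INT64, FLOAT, DOUBLE, FIXED_LEN_BYTE_ARRAY, INT96
}

// Hypothetical stand-in for the part of the record consumer used here.
interface BooleanSink {
    void addBoolean(boolean value);
}

final class BooleanWriterSketch {
    // A single comparison replaces the switch: any target other than BOOLEAN
    // is rejected with the same message the default branch produced.
    static void writeBoolean(PrimitiveKind target, String typeTag, boolean value, BooleanSink sink) {
        if (target != PrimitiveKind.BOOLEAN) {
            // Stand-in exception type, assumed for illustration only.
            throw new IllegalArgumentException("Typecast impossible from " + typeTag + " to " + target);
        }
        sink.addBoolean(value);
    }
}
```

The behavior matches the switch, but the single supported case is visible at a glance and new enum constants need no extra case labels.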


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/ca33f17c_ca5ac58c
PS30, Line 200:           case BINARY:
              :             case UUID:
              :             case POINT:
              :             case DURATION:
              :             case POINT3D:
              :             case ARRAY:
              :             case MULTISET:
              :             case OBJECT:
              :             case SPARSOBJECT:
              :             case UNION:
              :             case ENUM:
              :             case TYPE:
              :             case ANY:
              :             case LINE:
              :             case POLYGON:
              :             case CIRCLE:
              :             case RECTANGLE:
              :             case INTERVAL:
              :             case SYSTEM_NULL:
              :             case YEARMONTHDURATION:
              :             case DAYTIMEDURATION:
              :             case SHORTWITHOUTTYPEINFO:
> remove
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/f4465257_d2387f7b
PS30, Line 222: NULL
> We should be able to handle NULL if the schema said the value is optional.
Done
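
A minimal sketch of the rule this comment describes, assuming the writer knows each field's repetition from the schema. In Parquet, a null in an optional field is represented by not writing the field at all, so the decision reduces to "omit if optional, fail if required". RepetitionSketch and NullFieldSketch are hypothetical names, and IllegalStateException stands in for the project's own exception type.

```java
// Hypothetical stand-in for the field repetition declared in the schema.
enum RepetitionSketch {
    REQUIRED, OPTIONAL
}

final class NullFieldSketch {
    // Returns true when the field should be written and false when it should
    // be omitted. Throws when a required field holds NULL, because the schema
    // leaves no way to represent the absence of a value there.
    static boolean shouldWriteField(RepetitionSketch repetition, boolean valueIsNull, String fieldName) {
        if (!valueIsNull) {
            return true;
        }
        if (repetition == RepetitionSketch.OPTIONAL) {
            return false; // omitting the field is how the null is recorded
        }
        // Stand-in exception type, assumed for illustration only.
        throw new IllegalStateException("Field '" + fieldName + "' is required but the value is NULL");
    }
}
```

A caller would wrap its start-field, add-value and end-field calls in a check on this result.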


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/0306381d_024e1212
PS30, Line 223:             case GEOMETRY:
              :             case UINT8:
              :             case UINT16:
              :             case UINT32:
              :             case UINT64:
              :             case BITARRAY:
> remove
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/8ba764b2_10e4ae5a
PS30, Line 229: MISSING
> Could be treated as NULL
Done


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209/comment/b87396b3_d8e4808b
PS30, Line 230:     case DATETIME:
> remove
Done



--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18209

Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I40dc16969e66af09cde04b460f441af666b39d51
Gerrit-Change-Number: 18209
Gerrit-PatchSet: 31
Gerrit-Owner: preetham.polupar...@couchbase.com
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: preetham.polupar...@couchbase.com
Gerrit-CC: Wail Alkowaileet <wael....@gmail.com>
Gerrit-Attention: Wail Alkowaileet <wael....@gmail.com>
Gerrit-Comment-Date: Tue, 30 Apr 2024 12:47:47 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Wail Alkowaileet <wael....@gmail.com>
Gerrit-MessageType: comment
