[jira] [Commented] (PARQUET-1388) Nanosecond precision time and timestamp - parquet-mr

2018-10-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PARQUET-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638228#comment-16638228
 ] 

ASF GitHub Bot commented on PARQUET-1388:

zivanfi closed pull request #519: PARQUET-1388: Nanosecond precision time and 
timestamp - parquet-mr
URL: https://github.com/apache/parquet-mr/pull/519
 
 
   

This PR was merged from a forked repository. Because GitHub hides the
original diff once such a PR is merged, it is reproduced below for the
sake of provenance:

diff --git a/parquet-arrow/src/main/java/org/apache/parquet/arrow/schema/SchemaConverter.java b/parquet-arrow/src/main/java/org/apache/parquet/arrow/schema/SchemaConverter.java
index e02b03b5d..51057c589 100644
--- a/parquet-arrow/src/main/java/org/apache/parquet/arrow/schema/SchemaConverter.java
+++ b/parquet-arrow/src/main/java/org/apache/parquet/arrow/schema/SchemaConverter.java
@@ -23,6 +23,7 @@
 import static java.util.Optional.of;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.MICROS;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.MILLIS;
+import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.NANOS;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.dateType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.decimalType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.intType;
@@ -246,6 +247,8 @@ public TypeMapping visit(Time type) {
   return primitive(INT32, timeType(false, MILLIS));
 } else if (bitWidth == 64 && timeUnit == TimeUnit.MICROSECOND) {
   return primitive(INT64, timeType(false, MICROS));
+} else if (bitWidth == 64 && timeUnit == TimeUnit.NANOSECOND) {
+  return primitive(INT64, timeType(false, NANOS));
 }
 throw new UnsupportedOperationException("Unsupported type " + type);
   }
@@ -257,6 +260,8 @@ public TypeMapping visit(Timestamp type) {
   return primitive(INT64, timestampType(isUtcNormalized(type), MILLIS));
 } else if (timeUnit == TimeUnit.MICROSECOND) {
   return primitive(INT64, timestampType(isUtcNormalized(type), MICROS));
+} else if (timeUnit == TimeUnit.NANOSECOND) {
+  return primitive(INT64, timestampType(isUtcNormalized(type), NANOS));
 }
 throw new UnsupportedOperationException("Unsupported type " + type);
   }
@@ -460,6 +465,8 @@ public TypeMapping convertINT64(PrimitiveTypeName primitiveTypeName) throws Runt
   public Optional<TypeMapping> visit(LogicalTypeAnnotation.TimeLogicalTypeAnnotation timeLogicalType) {
 if (timeLogicalType.getUnit() == MICROS) {
   return of(field(new ArrowType.Time(TimeUnit.MICROSECOND, 64)));
+}  else if (timeLogicalType.getUnit() == NANOS) {
+  return of(field(new ArrowType.Time(TimeUnit.NANOSECOND, 64)));
 }
 return empty();
   }
@@ -471,6 +478,8 @@ public TypeMapping convertINT64(PrimitiveTypeName primitiveTypeName) throws Runt
 return of(field(new ArrowType.Timestamp(TimeUnit.MICROSECOND, getTimeZone(timestampLogicalType))));
   case MILLIS:
 return of(field(new ArrowType.Timestamp(TimeUnit.MILLISECOND, getTimeZone(timestampLogicalType))));
+  case NANOS:
+return of(field(new ArrowType.Timestamp(TimeUnit.NANOSECOND, getTimeZone(timestampLogicalType))));
 }
 return empty();
   }
diff --git a/parquet-arrow/src/test/java/org/apache/parquet/arrow/schema/TestSchemaConverter.java b/parquet-arrow/src/test/java/org/apache/parquet/arrow/schema/TestSchemaConverter.java
index 2817de263..c962b5456 100644
--- a/parquet-arrow/src/test/java/org/apache/parquet/arrow/schema/TestSchemaConverter.java
+++ b/parquet-arrow/src/test/java/org/apache/parquet/arrow/schema/TestSchemaConverter.java
@@ -21,6 +21,7 @@
 import static java.util.Arrays.asList;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.MICROS;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.MILLIS;
+import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit.NANOS;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.timeType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.timestampType;
 import static org.apache.parquet.schema.OriginalType.DATE;
@@ -66,6 +67,7 @@
 import org.apache.parquet.arrow.schema.SchemaMapping.TypeMappingVisitor;
 import org.apache.parquet.arrow.schema.SchemaMapping.UnionTypeMapping;
 import org.apache.parquet.example.Paper;
+import org.apache.parquet.schema.LogicalTypeAnnotation;
 import org.apache.parquet.schema.MessageType;
 import org.apache.parquet.schema.Types;
 import org.junit.Assert;
@@ -93,6 +95,7 @@ 

[jira] [Commented] (PARQUET-1388) Nanosecond precision time and timestamp - parquet-mr

2018-08-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PARQUET-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598521#comment-16598521
 ] 

ASF GitHub Bot commented on PARQUET-1388:

nandorKollar opened a new pull request #519: PARQUET-1388: Nanosecond precision 
time and timestamp - parquet-mr
URL: https://github.com/apache/parquet-mr/pull/519
 
 
   This PR introduces nanosecond precision time and timestamp types to 
parquet-mr. It depends on https://github.com/apache/parquet-format/pull/102 
and should be committed only once that format change is released.
   
   Nanosecond precision is introduced only in the new logical type API; the 
old version (ConvertedType in format and OriginalType in mr) has no 
corresponding enum value for nanoseconds. In the Thrift schema these legacy 
fields will therefore be null, so only new releases can interpret nanosecond 
precision, while older readers see only the physical type.
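
   To illustrate the fallback described above, here is a minimal, 
self-contained sketch. It is not the parquet-mr code: the class 
LegacyAnnotationSketch, its Unit enum and both methods are hypothetical 
stand-ins. Only the names TIMESTAMP_MILLIS and TIMESTAMP_MICROS are taken from 
the legacy ConvertedType enum. It models a timestamp whose unit has no legacy 
equivalent, so a reader that only understands the old annotation is left with 
the bare physical type.

```java
import java.util.Optional;

// Hypothetical, simplified model; these are NOT the parquet-mr classes.
public class LegacyAnnotationSketch {

    // Units available in the new logical type API.
    enum Unit { MILLIS, MICROS, NANOS }

    // Stand-in for the legacy enum, which has no nanosecond value.
    enum LegacyTimestampType { TIMESTAMP_MILLIS, TIMESTAMP_MICROS }

    // Maps a new-style unit to the legacy annotation, if one exists.
    static Optional<LegacyTimestampType> toLegacy(Unit unit) {
        switch (unit) {
            case MILLIS:
                return Optional.of(LegacyTimestampType.TIMESTAMP_MILLIS);
            case MICROS:
                return Optional.of(LegacyTimestampType.TIMESTAMP_MICROS);
            default:
                // NANOS has no legacy counterpart, so the legacy field stays unset.
                return Optional.empty();
        }
    }

    // What a reader that only understands the legacy annotation would see.
    static String seenByOldReader(Unit unit) {
        return toLegacy(unit)
                .map(legacy -> "INT64 (" + legacy + ")")
                .orElse("INT64");
    }

    public static void main(String[] args) {
        for (Unit unit : Unit.values()) {
            System.out.println(unit + " -> " + seenByOldReader(unit));
        }
    }
}
```

   In this model a nanosecond column degrades to a plain INT64 for old readers 
rather than failing, which matches the behaviour described in the paragraph 
above.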
   
   In addition to nanosecond precision, I also refactored the modules to use 
the new logical type API for internal decisions (e.g. replacing the 
OriginalType-based switch cases, such as the one in the type builder that 
checks whether the proper annotation is present on the physical type). This 
part (the commit titled "Refactor modules to use the new logical type API") 
doesn't need a new format release and can be split out from this PR if needed.
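
   For the Arrow schema converter, the time-type branches touched by this PR 
can be modelled roughly as follows. This is a sketch under assumptions, not the 
SchemaConverter code: TimeMappingSketch, ParquetTime and the enums are 
hypothetical stand-ins, and the condition for the 32-bit millisecond branch is 
assumed, since the diff shows only that branch's return statement.

```java
import java.util.Optional;

// Hypothetical, simplified model of the Arrow <-> Parquet time mapping.
public class TimeMappingSketch {

    enum ArrowUnit { SECOND, MILLISECOND, MICROSECOND, NANOSECOND }
    enum ParquetUnit { MILLIS, MICROS, NANOS }
    enum Physical { INT32, INT64 }

    // A Parquet physical type plus its time unit.
    static final class ParquetTime {
        final Physical physical;
        final ParquetUnit unit;

        ParquetTime(Physical physical, ParquetUnit unit) {
            this.physical = physical;
            this.unit = unit;
        }

        @Override
        public String toString() {
            return physical + " TIME(" + unit + ")";
        }
    }

    // Arrow -> Parquet: combinations without a branch are rejected.
    static ParquetTime toParquet(int bitWidth, ArrowUnit unit) {
        if (bitWidth == 32 && unit == ArrowUnit.MILLISECOND) { // assumed condition
            return new ParquetTime(Physical.INT32, ParquetUnit.MILLIS);
        } else if (bitWidth == 64 && unit == ArrowUnit.MICROSECOND) {
            return new ParquetTime(Physical.INT64, ParquetUnit.MICROS);
        } else if (bitWidth == 64 && unit == ArrowUnit.NANOSECOND) {
            return new ParquetTime(Physical.INT64, ParquetUnit.NANOS);
        }
        throw new UnsupportedOperationException(
                "Unsupported type Time(" + unit + ", " + bitWidth + ")");
    }

    // Parquet INT64 time -> Arrow unit: empty when there is no 64-bit mapping.
    static Optional<ArrowUnit> int64ToArrow(ParquetUnit unit) {
        if (unit == ParquetUnit.MICROS) {
            return Optional.of(ArrowUnit.MICROSECOND);
        } else if (unit == ParquetUnit.NANOS) {
            return Optional.of(ArrowUnit.NANOSECOND);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(toParquet(64, ArrowUnit.NANOSECOND));
        System.out.println(int64ToArrow(ParquetUnit.NANOS));
    }
}
```

   Returning an empty Optional in the INT64 direction mirrors the visitor 
pattern in the diff, where unit combinations without a dedicated branch fall 
through to empty().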




> Nanosecond precision time and timestamp - parquet-mr
>
>         Key: PARQUET-1388
>         URL: https://issues.apache.org/jira/browse/PARQUET-1388
>     Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>    Reporter: Nandor Kollar
>    Priority: Major
>      Labels: pull-request-available
>



