[jira] [Created] (HIVE-17250) Avoid manually deploying changes to the Jenkins server in testutils/ptest2/conf

2017-08-03 Thread Chao Sun (JIRA)
Chao Sun created HIVE-17250:
---

 Summary: Avoid manually deploying changes to the Jenkins server in 
testutils/ptest2/conf
 Key: HIVE-17250
 URL: https://issues.apache.org/jira/browse/HIVE-17250
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Chao Sun


At the moment it seems all changes to test properties under 
testutils/ptest2/conf have to be manually deployed to the Jenkins server to 
take effect. Ideally, the Jenkins job should pick up such changes from 
submitted patches and apply them before triggering tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-17249) Concurrent appendPartition calls lead to data loss

2017-08-03 Thread Hans Zeller (JIRA)
Hans Zeller created HIVE-17249:
--

 Summary: Concurrent appendPartition calls lead to data loss
 Key: HIVE-17249
 URL: https://issues.apache.org/jira/browse/HIVE-17249
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.1
 Environment: Hortonworks HDP 2.4.
MySQL metastore.
Reporter: Hans Zeller


We are running into a problem with data getting lost when loading data in 
parallel into a partitioned Hive table. The data loader runs on multiple nodes 
and it dynamically creates partitions as it needs them, using the 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(String,
String, String) interface. We assume that if multiple processes try to create 
the same partition at the same time, only one of them succeeds while the others 
fail.

What we are seeing is that the partition gets created, but a few of the created 
files end up in the .Trash folder in HDFS. From the metastore log, we assume 
the following is happening in the threads of the metastore server:

- Thread 1: A first process tries to create a partition.
- Thread 1: The org.apache.hadoop.hive.metastore.HiveMetaStore.append_common() 
method creates the HDFS directory.
- Thread 2: A second process tries to create the same partition.
- Thread 2: Notices that the directory already exists and skips creating it.
- Thread 2: Updates the metastore.
- Thread 2: Returns success to the caller.
- Caller 2: Creates a file in the partition directory and starts inserting.
- Thread 1: Tries to update the metastore, but this fails, since thread 2 has 
already inserted the partition. Retries the operation, but it still fails.
- Thread 1: Aborts the transaction and moves the HDFS directory to the trash, 
since it knows that it created the directory.
- Thread 1: Returns failure to the caller.

The first caller can now continue to load data successfully, but the file it 
writes is actually already in the trash. It reports success, yet the data is 
never inserted and is not visible in the table.

Note that in our case the callers that got an error continue as well; they 
ignore the error and, we believe, automatically create the HDFS partition 
directory when they create their output files. Those processes insert their 
data successfully; the data that is lost comes from the process that 
originally created the partition, we believe.
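For illustration, the interleaving above can be reduced to a deterministic sketch. All class and variable names below are hypothetical; sets stand in for HDFS directories, the .Trash folder, and the metastore's partition table:

```java
import java.util.HashSet;
import java.util.Set;

// Deterministic sketch of the append_partition race described above.
public class AppendPartitionRace {
    static final Set<String> hdfsDirs = new HashSet<>();
    static final Set<String> trash = new HashSet<>();
    static final Set<String> metastoreParts = new HashSet<>();

    // Returns {partition in metastore, directory on HDFS, directory in .Trash}.
    static boolean[] simulateRace(String part) {
        // Thread 1: creates the HDFS directory and remembers that it did.
        boolean t1CreatedDir = hdfsDirs.add(part);            // true

        // Thread 2: directory already exists, so it skips the mkdir,
        // updates the metastore, and returns success to caller 2.
        hdfsDirs.add(part);                                   // false: already there
        metastoreParts.add(part);                             // true: thread 2 wins
        // Caller 2 now starts writing a file into the directory.

        // Thread 1 resumes: the metastore insert (and its retries) fail.
        boolean t1Inserted = metastoreParts.add(part);        // false
        if (!t1Inserted && t1CreatedDir) {
            // Failure cleanup trashes the directory thread 1 created --
            // together with the file caller 2 is writing into it.
            hdfsDirs.remove(part);
            trash.add(part);
        }
        return new boolean[] {
            metastoreParts.contains(part),                    // true
            hdfsDirs.contains(part),                          // false
            trash.contains(part)                              // true
        };
    }

    public static void main(String[] args) {
        boolean[] r = simulateRace("ds=2016-01-01");
        System.out.println("in metastore: " + r[0]
                + ", on HDFS: " + r[1] + ", in .Trash: " + r[2]);
    }
}
```

The end state matches the report: the partition exists in the metastore, but its directory (and the data written by caller 2) is in the trash.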





[jira] [Created] (HIVE-17248) DPP isn't triggered if static pruning is done for one of the partition columns

2017-08-03 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17248:
---

 Summary: DPP isn't triggered if static pruning is done for one of 
the partition columns
 Key: HIVE-17248
 URL: https://issues.apache.org/jira/browse/HIVE-17248
 Project: Hive
  Issue Type: Bug
Reporter: Sahil Takiar


Queries such as:

{code}
EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = 
srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr) 
where srcpart_date.`date` = '2008-04-08' and srcpart.hr = 13
{code}

Here {{srcpart}} is partitioned by {{ds}} and {{hr}}. DPP isn't triggered 
for the join {{(srcpart.ds = srcpart_date.ds)}}, even though it could be. 
I'm guessing it's because static pruning is triggered for the {{(srcpart.hr = 
srcpart_hour.hr)}} join condition.

Affects Hive-on-Tez and Hive-on-Spark.





[jira] [Created] (HIVE-17247) HoS DPP: UDFs on the partition column side do not evaluate correctly

2017-08-03 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17247:
---

 Summary: HoS DPP: UDFs on the partition column side do not 
evaluate correctly
 Key: HIVE-17247
 URL: https://issues.apache.org/jira/browse/HIVE-17247
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Sahil Takiar
Assignee: Sahil Takiar


Same problem as HIVE-12473 and HIVE-12667.

The query below (uses tables from {{spark_dynamic_partition_pruning.q}}) 
returns incorrect results:

{code}
select count(*) from srcpart join srcpart_date on (day(srcpart.ds) = 
day(srcpart_date.ds)) where srcpart_date.`date` = '2008-04-08';
{code}

It returns 0 when DPP is enabled; with DPP disabled it returns 1000.





[jira] [Created] (HIVE-17246) Add having related blobstore query test

2017-08-03 Thread Taklon Stephen Wu (JIRA)
Taklon Stephen Wu created HIVE-17246:


 Summary: Add having related blobstore query test
 Key: HIVE-17246
 URL: https://issues.apache.org/jira/browse/HIVE-17246
 Project: Hive
  Issue Type: Test
Affects Versions: 2.1.1
Reporter: Taklon Stephen Wu
Assignee: Taklon Stephen Wu








[jira] [Created] (HIVE-17245) Replace MemoryEstimate interface with JOL

2017-08-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-17245:


 Summary: Replace MemoryEstimate interface with JOL
 Key: HIVE-17245
 URL: https://issues.apache.org/jira/browse/HIVE-17245
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Replace the MemoryEstimate interface used by the hash table loading memory 
monitor with JOL for better memory estimation.





[jira] [Created] (HIVE-17244) DPP isn't triggered against partition columns that are added to each other

2017-08-03 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17244:
---

 Summary: DPP isn't triggered against partition columns that are 
added to each other
 Key: HIVE-17244
 URL: https://issues.apache.org/jira/browse/HIVE-17244
 Project: Hive
  Issue Type: Improvement
Reporter: Sahil Takiar


DPP doesn't get triggered for queries such as:

{code}
EXPLAIN SELECT count(*) FROM partitioned_table4 pt4 JOIN regular_table1 rt1 ON 
pt4.part_col1 + pt4.part_col2 = rt1.col1 + 1;
{code}

This affects both Hive-on-Spark and Hive-on-Tez.





[jira] [Created] (HIVE-17243) Replace Hive's DataModel with Java Object Layout

2017-08-03 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-17243:


 Summary: Replace Hive's DataModel with Java Object Layout
 Key: HIVE-17243
 URL: https://issues.apache.org/jira/browse/HIVE-17243
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


org.apache.hadoop.hive.ql.util.JavaDataModel is used in many places to 
estimate the memory footprint of objects. The problem with this approach is 
that every field and reference has to be accounted for manually. It is also 
problematic for cases where both shallow and deep object sizes are required. 
Hash table memory monitoring accounts for the size of the hash tables. The 
estimated sizes of hash tables are often very different from the actual sizes 
for non-vectorized operators, whereas the estimates are close to the actual 
object sizes for vectorized operators. Adding a field also requires manual 
changes to the memory monitoring. 

Java Object Layout (JOL) is an OpenJDK project that can provide shallow and 
deep object sizes without using Java agents. We can leverage it for much more 
accurate memory estimates.
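To illustrate the maintenance burden JOL would remove, here is a hedged, stdlib-only sketch of JavaDataModel-style manual shallow-size accounting. The 16-byte header, 8-byte references, and 8-byte alignment below are assumptions about one 64-bit JVM layout; JOL measures the actual layout instead of guessing it, and can also walk deep sizes:

```java
import java.lang.reflect.Field;

// Manual shallow-size accounting in the style of JavaDataModel: every
// field must be enumerated and sized by hand (here via reflection).
public class ManualSizeEstimate {
    // Hypothetical hash table entry; adding a field silently changes the
    // estimate -- the maintenance problem described above.
    static class HashEntry {
        long key;
        int hash;
        Object value;
        HashEntry next;
    }

    static long shallowSize(Class<?> c) {
        long size = 16;                                   // assumed object header
        for (Field f : c.getDeclaredFields()) {
            Class<?> t = f.getType();
            if (t == long.class || t == double.class) size += 8;
            else if (t == int.class || t == float.class) size += 4;
            else if (t == short.class || t == char.class) size += 2;
            else if (t == byte.class || t == boolean.class) size += 1;
            else size += 8;                               // assumed reference size
        }
        return (size + 7) & ~7L;                          // align to 8 bytes
    }

    public static void main(String[] args) {
        // 16 (header) + 8 (key) + 4 (hash) + 8 (value) + 8 (next) = 44 -> 48
        System.out.println(shallowSize(HashEntry.class)); // 48
    }
}
```

With JOL the same number would come from measuring a real instance rather than from hand-maintained bookkeeping.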





[jira] [Created] (HIVE-17242) Vectorized query execution for parquet tables on S3 fail with Timeout waiting for connection from pool exception

2017-08-03 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-17242:
--

 Summary: Vectorized query execution for parquet tables on S3 fail 
with Timeout waiting for connection from pool exception
 Key: HIVE-17242
 URL: https://issues.apache.org/jira/browse/HIVE-17242
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


When I turn vectorization on with tables on S3 using Hive-on-Spark, many of 
the TPC-DS queries fail with the error "Timeout waiting for connection from 
pool" from the S3A client. This does not happen when I turn vectorization 
off.

Here is the exception trace I am seeing:

{noformat}
Driver stacktrace:
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1452)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1440)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1439)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1439)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at scala.Option.foreach(Option.scala:257)
at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1665)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1620)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1609)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:269)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.&lt;init&gt;(HadoopShimsSecure.java:216)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:343)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:681)
at org.apache.spark.rdd.HadoopRDD$$anon$1.&lt;init&gt;(HadoopRDD.scala:245)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor43.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:255)
... 21 more
Caused by: java.lang.RuntimeException: java.io.InterruptedIOException: 
getFileStatus on 
s3a://cloudera-dev-hive-s3/vihang/tpcds_30_decimal_parquet/store_sales/ss_sold_date_sk=2452583/2b4b7a8a5573cc3a-682f12
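The failure mode behind the HIVE-17242 trace above can be sketched with a toy bounded pool. The pool size, timeout, and names below are illustrative assumptions, not S3A's actual internals: when readers borrow a connection and never return it, the next borrow times out.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model of a bounded connection pool exhausted by leaky readers.
public class PoolExhaustion {
    // Each attempt "opens a reader" that leaks its connection.
    static boolean[] simulateLeak(int poolSize, int attempts)
            throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize);
        boolean[] acquired = new boolean[attempts];
        for (int i = 0; i < attempts; i++) {
            // tryAcquire stands in for borrowing a pooled connection; the
            // "reader" never calls release(), so permits are never returned.
            acquired[i] = pool.tryAcquire(50, TimeUnit.MILLISECONDS);
        }
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = simulateLeak(2, 3);
        // Third open times out, analogous to
        // "Timeout waiting for connection from pool".
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true true false
    }
}
```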

[jira] [Created] (HIVE-17241) Change metastore classes to not use the shims

2017-08-03 Thread Alan Gates (JIRA)
Alan Gates created HIVE-17241:
-

 Summary: Change metastore classes to not use the shims
 Key: HIVE-17241
 URL: https://issues.apache.org/jira/browse/HIVE-17241
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Alan Gates
Assignee: Alan Gates


As part of moving the metastore into a standalone package, it will no longer 
have access to the shims.  This means we need to either copy them or access the 
underlying Hadoop operations directly.





[GitHub] hive pull request #220: HIVE-17224 Moved all the JDO classes and package.jdo

2017-08-03 Thread alanfgates
GitHub user alanfgates opened a pull request:

https://github.com/apache/hive/pull/220

HIVE-17224 Moved all the JDO classes and package.jdo



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alanfgates/hive hive17224

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #220


commit cd70b9ce7798251d5dd8b247bba0fb8211ebce40
Author: Alan Gates 
Date:   2017-07-27T21:27:57Z

Moved all the JDO classes and package.jdo




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] hive pull request #210: HIVE-17168 Create separate module for stand alone me...

2017-08-03 Thread alanfgates
Github user alanfgates closed the pull request at:

https://github.com/apache/hive/pull/210




[GitHub] hive pull request #211: HIVE-17167 Create metastore specific configuration t...

2017-08-03 Thread alanfgates
Github user alanfgates closed the pull request at:

https://github.com/apache/hive/pull/211




[GitHub] hive pull request #216: HIVE-17170 Move thrift generated code to stand alone...

2017-08-03 Thread alanfgates
Github user alanfgates closed the pull request at:

https://github.com/apache/hive/pull/216




Re: Review Request 61380: HIVE-14786: Beeline displays binary column data as string instead of byte array

2017-08-03 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61380/#review182109
---


Ship it!




Ship It!

- Peter Vary


On Aug. 2, 2017, 3:56 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61380/
> ---
> 
> (Updated Aug. 2, 2017, 3:56 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Bugs: HIVE-14786
> https://issues.apache.org/jira/browse/HIVE-14786
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-14786: Beeline displays binary column data as string instead of byte 
> array
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 
> 3ebbc9af9ba1a99dfc1d0af63ba362bae5eb2df4 
>   beeline/src/java/org/apache/hive/beeline/Rows.java 
> 924b9519a64427936101a9dc4bbe1831719194e6 
>   beeline/src/main/resources/BeeLine.properties 
> 3b8e3e6e9c94d88e5b05b136012aaa0e605262f1 
>   beeline/src/test/org/apache/hive/beeline/TestBufferedRows.java 
> f3f3d3a20cfd751b544636d86ad95e8ad7a2341d 
>   
> beeline/src/test/org/apache/hive/beeline/TestIncrementalRowsWithNormalization.java
>  68da841f850d2e97bf4b89071ec6d20ce8cf5d10 
>   beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
> c7d9f8095cf56df957ebe2f50ed033a09bd4e31b 
> 
> 
> Diff: https://reviews.apache.org/r/61380/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>



Re: Review Request 61380: HIVE-14786: Beeline displays binary column data as string instead of byte array

2017-08-03 Thread Peter Vary


> On Aug. 3, 2017, 11:52 a.m., Peter Vary wrote:
> > beeline/src/java/org/apache/hive/beeline/Rows.java
> > Lines 164 (patched)
> > 
> >
> > Why not calling the same o.toString() as before?
> 
> Barna Zsombor Klara wrote:
> I'm not sure where you would like to have the o.toString. Arrays don't 
> have their toString overridden so a direct call would result in nonsensical 
> output.

Discussed offline.
o.toString() was used only when numberFormat was set, which was another error 
in the code.
If numberFormat was not set, then HiveBaseResultSet.getString() was used, 
which called new String((byte[])value) to convert the data.
This way backward compatibility was kept.
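The three renderings under discussion can be reproduced with a minimal stand-alone snippet (illustrative only, not Beeline's actual code path):

```java
import java.util.Arrays;

// Illustration of the byte[] renderings discussed above.
public class BinaryColumnDisplay {
    public static void main(String[] args) {
        byte[] value = {72, 105};                    // the bytes of "Hi"
        // Arrays don't override toString(): prints an identity hash like [B@1b6d3586
        System.out.println(value.toString());
        // Old behavior kept for backward compatibility: new String((byte[])value)
        System.out.println(new String(value));       // Hi
        // Byte-array display via Arrays.toString()
        System.out.println(Arrays.toString(value));  // [72, 105]
    }
}
```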

Thanks Zsombor for the clarification


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61380/#review182090
---





Re: Review Request 61380: HIVE-14786: Beeline displays binary column data as string instead of byte array

2017-08-03 Thread Barna Zsombor Klara


> On Aug. 3, 2017, 11:52 a.m., Peter Vary wrote:
> > beeline/src/java/org/apache/hive/beeline/Rows.java
> > Lines 164 (patched)
> > 
> >
> > Why not calling the same o.toString() as before?

I'm not sure where you would like to have the o.toString. Arrays don't have 
their toString overridden so a direct call would result in nonsensical output.


- Barna Zsombor


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61380/#review182090
---





Re: Review Request 61380: HIVE-14786: Beeline displays binary column data as string instead of byte array

2017-08-03 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61380/#review182090
---



Thanks for the patch Zsombor!
Just a minor comment and a question.

Thanks,
Peter


beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java
Lines 80 (patched)


Please add the new parameter to the help text too:
BeeLine.properties
cmd-usage



beeline/src/java/org/apache/hive/beeline/Rows.java
Lines 164 (patched)


Why not calling the same o.toString() as before?


- Peter Vary


On Aug. 2, 2017, 3:56 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61380/
> ---
> 
> (Updated Aug. 2, 2017, 3:56 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Bugs: HIVE-14786
> https://issues.apache.org/jira/browse/HIVE-14786
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-14786: Beeline displays binary column data as string instead of byte 
> array
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 
> 3ebbc9af9ba1a99dfc1d0af63ba362bae5eb2df4 
>   beeline/src/java/org/apache/hive/beeline/Rows.java 
> 924b9519a64427936101a9dc4bbe1831719194e6 
>   
> beeline/src/test/org/apache/hive/beeline/TestIncrementalRowsWithNormalization.java
>  68da841f850d2e97bf4b89071ec6d20ce8cf5d10 
> 
> 
> Diff: https://reviews.apache.org/r/61380/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>



[jira] [Created] (HIVE-17240) function acos(2) should be null

2017-08-03 Thread Yuming Wang (JIRA)
Yuming Wang created HIVE-17240:
--

 Summary: function acos(2) should be null
 Key: HIVE-17240
 URL: https://issues.apache.org/jira/browse/HIVE-17240
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 2.2.0, 1.2.2, 1.1.1
Reporter: Yuming Wang


{{acos(2)}} should be NULL, as in MySQL:
{code:sql}
hive> desc function extended acos;
OK
acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise
Example:
  > SELECT acos(1) FROM src LIMIT 1;
  0
  > SELECT acos(2) FROM src LIMIT 1;
  NULL
Time taken: 0.009 seconds, Fetched: 6 row(s)
hive> select acos(2);
OK
NaN
Time taken: 0.437 seconds, Fetched: 1 row(s)
{code}

{code:sql}
mysql>  select acos(2);
+---------+
| acos(2) |
+---------+
|    NULL |
+---------+
1 row in set (0.00 sec)
{code}
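A minimal sketch of the documented contract (a hypothetical helper, not Hive's actual GenericUDF code): return null (SQL NULL) for arguments outside [-1, 1] instead of the NaN that java.lang.Math.acos produces.

```java
// Hypothetical helper: acos(x) is defined only for -1 <= x <= 1; outside
// that domain return null rather than the NaN that Math.acos(2) yields.
public class SafeAcos {
    static Double safeAcos(double x) {
        if (x < -1.0 || x > 1.0) {
            return null;                     // out of domain -> NULL
        }
        return Math.acos(x);                 // in-domain: normal arc cosine
    }

    public static void main(String[] args) {
        System.out.println(safeAcos(1));     // 0.0
        System.out.println(safeAcos(2));     // null
    }
}
```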





Re: Review Request 61379: HIVE-16294: Support snapshot for truncate table

2017-08-03 Thread Peter Vary


> On Aug. 3, 2017, 8:45 a.m., Peter Vary wrote:
> > common/src/java/org/apache/hadoop/hive/common/FileUtils.java
> > Lines 98 (patched)
> > 
> >
> > There is no straightforward api to decide if a given path is 
> > snapshottable, or not, but I was able to come up with this:
> > 
> > SnapshottableDirectoryStatus[] statuses = 
> > ((DistributedFileSystem)fs).getSnapshottableDirListing();
> > for (SnapshottableDirectoryStatus status: statuses) {
> >   if (status.getFullPath().equals(new Path(""))) {
> >   
> >   }
> > }
> > 
> > Maybe this is better than simply relying on the naming convention. We 
> > can use it like HdfsUtils.getFileId()
> > 
> > What do you think?

The API is not for public consumption :(
I think we should file a Hadoop ticket requesting a public API; until then it 
is OK to use the string matching.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61379/#review182084
---


On Aug. 2, 2017, 2:03 p.m., Barna Zsombor Klara wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61379/
> ---
> 
> (Updated Aug. 2, 2017, 2:03 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Bugs: HIVE-16294
> https://issues.apache.org/jira/browse/HIVE-16294
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16294: Support snapshot for truncate table
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> e8a3a7a49e31d02ba7ccb8774ea59c2cf0fea536 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 6a6fd439d72fd5e24c881554c86480b0b3e19574 
> 
> 
> Diff: https://reviews.apache.org/r/61379/diff/1/
> 
> 
> Testing
> ---
> 
> Manual testing as automated testing would entail the creation of snapshots 
> using hadoop which as far as I know is not supported with the current Hive 
> testing framework.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>



Re: Review Request 61379: HIVE-16294: Support snapshot for truncate table

2017-08-03 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61379/#review182084
---



Thanks for the patch Zsombor!
I agree with how you handle the case when there are snapshots.


common/src/java/org/apache/hadoop/hive/common/FileUtils.java
Lines 98 (patched)


There is no straightforward api to decide if a given path is snapshottable, 
or not, but I was able to come up with this:

SnapshottableDirectoryStatus[] statuses = 
((DistributedFileSystem)fs).getSnapshottableDirListing();
for (SnapshottableDirectoryStatus status: statuses) {
  if (status.getFullPath().equals(new Path(""))) {
  
  }
}

Maybe this is better than simply relying on the naming convention. We can 
use it like HdfsUtils.getFileId()

What do you think?


- Peter Vary

