Review Request 70372: HIVE-21427: Syslog storage handler

2019-04-02 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70372/
---

Review request for hive, Ashutosh Chauhan and Jason Dere.


Bugs: HIVE-21427
https://issues.apache.org/jira/browse/HIVE-21427


Repository: hive-git


Description
---

HIVE-21427: Syslog storage handler


Diffs
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 777f8b51215523fca8e396ddf77139420666311a
  data/files/syslog-hs2-2.log PRE-CREATION
  data/files/syslog-hs2.log PRE-CREATION
  itests/src/test/resources/testconfiguration.properties 96dfbc4b56b6eb3dff6b8e1e42a2371d090426e7
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 9ef7af4eb0c9787a33d2aa4c9a4528b8f356106b
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 27fe828b7531584138cd002956a9fcc20f238f71
  ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogParser.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogSerDe.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogStorageHandler.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/log/TestSyslogInputFormat.java PRE-CREATION
  ql/src/test/queries/clientpositive/syslog_parser.q PRE-CREATION
  ql/src/test/queries/clientpositive/syslog_parser_file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/llap/syslog_parser.q.out PRE-CREATION
  ql/src/test/results/clientpositive/llap/syslog_parser_file_pruning.q.out PRE-CREATION
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java eafe0c6d46d448bce287e61fabac0384b12b9295
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolCommandLine.java 6282078411c4c728beed8e957aa857ed3c02133c
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskCreateLogsTable.java PRE-CREATION


Diff: https://reviews.apache.org/r/70372/diff/1/


Testing
---


Thanks,

Prasanth_J



Re: HIVE-18624 SQL parser performance bug

2019-04-02 Thread Zoltan Haindrich

Hello Julian!

Around the time I submitted that patch, I only pushed it to branch-2 and 
branch-3.
I wasn't thinking about also putting it on branch-2.3 and branch-3.1; if I 
had been, it would already have been released.
I've just pushed it to branch-2.3 and branch-3.1, so if there is a patch 
release, it will contain this fix.

cheers,
Zoltan

On 4/1/19 10:22 PM, Julian Hyde wrote:

HIVE-18624 [1] is a serious performance bug in the SQL parser. It causes parse 
times that are literally exponential in the number of parentheses in the 
expression, so parsing a query that has complex expressions may take 
minutes or not terminate. According to JIRA, the bug was fixed on the 2.4.0, 
3.1.0 and 4.0.0 code lines in August, but the fix has not yet been released.

I work for Looker, a BI tool that generates SQL with deeply nested expressions, 
and its queries therefore hit this bug. Hive 2.2, 2.3 and 3.0 are unusable for 
our customers due to this bug.

I do not know the schedule for 2.4.0, 3.1.0 or 4.0.0 releases, but if they are 
a way off, would it be possible to fix this bug in a patch release?

Julian

[1] https://issues.apache.org/jira/browse/HIVE-18624 





[jira] [Created] (HIVE-21565) Utilities::isEmptyPath should throw back FNFE instead of returning true

2019-04-02 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-21565:
---

             Summary: Utilities::isEmptyPath should throw back FNFE instead of returning true
                 Key: HIVE-21565
                 URL: https://issues.apache.org/jira/browse/HIVE-21565
             Project: Hive
          Issue Type: Bug
            Reporter: Rajesh Balamohan


When a {{viewfs}} is configured and it throws a FileNotFoundException (FNFE), 
the current code path silently ignores the error and ends up creating an empty 
file. 


{noformat}
at org.apache.hadoop.fs.viewfs.InodeTree.resolve(InodeTree.java:403)
at org.apache.hadoop.fs.viewfs.ViewFileSystem.listStatus(ViewFileSystem.java:374)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1497)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1537)
at org.apache.hadoop.hive.ql.exec.Utilities.isEmptyPath(Utilities.java:2350)
at org.apache.hadoop.hive.ql.exec.Utilities.isEmptyPath(Utilities.java:2343)
at org.apache.hadoop.hive.ql.exec.Utilities$GetInputPathsCallable.call(Utilities.java:3128)
at org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:3092)
at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.cloneJobConf(SparkPlanGenerator.java:303)
at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:226)
at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:346)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:358)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
{noformat}
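
To illustrate the difference the issue title asks for, here is a minimal sketch, not the actual Hive code. It assumes a stand-in built on java.nio.file rather than the Hadoop FileSystem API, so that it runs without Hadoop on the classpath; the class and method names (EmptyPathSketch, isEmptyPathLenient, isEmptyPathStrict) are hypothetical. The lenient variant models the behavior described above (a missing path is reported as empty), while the strict variant lets the FNFE reach the caller.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical stand-in for illustration only; not Hive's Utilities class.
final class EmptyPathSketch {

  /** Lenient variant: a missing path is silently reported as "empty". */
  public static boolean isEmptyPathLenient(Path dir) throws IOException {
    try {
      return isEmptyPathStrict(dir);
    } catch (FileNotFoundException e) {
      // The error is swallowed here, so the caller cannot tell a missing
      // (or unresolvable) path apart from an empty directory.
      return true;
    }
  }

  /** Strict variant: a missing path surfaces as FileNotFoundException. */
  public static boolean isEmptyPathStrict(Path dir) throws IOException {
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      // Empty means the directory listing yields no entries at all.
      return !entries.iterator().hasNext();
    } catch (NoSuchFileException e) {
      // java.nio reports a missing path as NoSuchFileException; translate it
      // to FileNotFoundException to mirror the FNFE named in the report.
      FileNotFoundException fnfe = new FileNotFoundException(dir.toString());
      fnfe.initCause(e);
      throw fnfe;
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("empty-path-sketch");
    Path missing = dir.resolve("does-not-exist");
    System.out.println("existing empty dir, strict: " + isEmptyPathStrict(dir));
    System.out.println("missing dir, lenient: " + isEmptyPathLenient(missing));
    try {
      isEmptyPathStrict(missing);
    } catch (FileNotFoundException e) {
      System.out.println("missing dir, strict: FNFE for " + e.getMessage());
    }
  }
}
```

With the strict variant, a caller such as the input-path collection seen in the stack trace would have to decide explicitly how to handle a missing path instead of treating it as empty.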



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Introduce FORMAT clause to CAST with SQL:2016 datetime patterns

2019-04-02 Thread Gabor Kaszab
Thanks for the feedback!
As I haven't received any comments recently, and I hope I have addressed the
previous ones, I'll advance to the next step and open the related Jiras for
both Spark and Hive.

Cheers,
Gabor


On Thu, Mar 21, 2019 at 12:00 PM Gabor Kaszab 
wrote:

> Thanks for the quick feedback, Maciej and Shawn!
>
> Maciej:
> The concern about confusing users by supporting multiple datetime
> patterns is a valid one. The cleanest way to introduce SQL:2016 patterns
> would be to drop the existing pattern support (SimpleDateFormat in the case
> of Impala) and replace it with the new approach. This, however, would break
> backwards compatibility and existing user workflows that use
> the old patterns. So in order to introduce the patterns from the standard
> (to be in sync with RDBMSs like Oracle, PostgreSQL and so on), the only
> way I see is to have both approaches next to each other. To reduce user
> confusion, I think we should make sure the docs cover this
> topic well and clarify which pattern is used in which scenario.
>
> Cheers,
> Gabor
>
>
>
>
> On Wed, Mar 20, 2019 at 9:37 PM Shawn Weeks 
> wrote:
>
>> I’ve done some work on a to-timestamp function for Hive, and one of the
>> things I keep running into is that most datetime libraries don’t support
>> fractional seconds in their format patterns, yet most RDBMSs do support
>> fractional seconds. It tends to trip things up when you’re porting SQL over.
>> If we’re going the CAST-with-FORMAT way everywhere, I’d like it to support
>> that.
>>
>> Thanks
>> Shawn Weeks
>>
>> Sent from my iPhone
>>
>> > On Mar 20, 2019, at 4:53 AM, Gabor Kaszab 
>> wrote:
>> >
>> > Hey Hive and Spark communities,
>> > [dev@impala in cc]
>> >
>> > I'm working on an Impala improvement to introduce the FORMAT clause
>> > within the CAST() operator and to implement ISO SQL:2016 datetime
>> > pattern support for this new FORMAT clause:
>> > https://issues.apache.org/jira/browse/IMPALA-4018
>> >
>> > One example of the new format:
>> > SELECT(CAST("2018-01-02 09:15" as timestamp FORMAT "YYYY-MM-DD HH12:MI"));
>> >
>> > I have put together a document for my proposal of how to do this in
>> > Impala and what patterns we plan to support to cover the SQL standard
>> > and what additional patterns we propose to support on top of the
>> > standard's recommendation.
>> >
>> > https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
>> >
>> > I am sharing this with the Hive and Spark communities because I feel
>> > it would be nice if these systems were in line with the Impala
>> > implementation. So I'd like to involve these communities in the planning
>> > phase of this task so that everyone can share their opinion about
>> > whether this makes sense in the proposed form.
>> > Eventually I feel that each of these systems should have the SQL:2016
>> > datetime format, and I think it would be nice to have it with a newly
>> > introduced CAST(..FORMAT..) clause.
>> >
>> > I would like to ask members from both Hive and Spark to take a look at
>> > my proposal and share their opinion from their own component's
>> > perspective. If we get on the same page, I'll eventually open Jiras to
>> > cover this improvement for each of the mentioned systems.
>> >
>> > Cheers,
>> > Gabor
>>
>
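
As a rough illustration of the pattern handling discussed in this thread, the sketch below translates a small subset of SQL:2016-style datetime tokens into a java.time pattern and parses a value with it. This is only a sketch under stated assumptions, not the Impala, Hive or Spark implementation: the class and method names are hypothetical, the token subset (YYYY, MM, DD, HH24, MI, SS, FF1 to FF9) and the mapping to java.time letters are choices made for this example, and only uppercase tokens are recognized. HH12 is deliberately left out because a 12-hour value cannot be resolved to a full timestamp in java.time without a meridiem indicator. The FFn tokens show one way to cover the fractional seconds mentioned in the thread.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; names and token subset are assumptions.
final class SqlDatetimePatternSketch {

  // SQL:2016-style token -> java.time pattern letters (subset chosen for this sketch).
  private static final Map<String, String> TOKENS = new LinkedHashMap<>();
  static {
    TOKENS.put("YYYY", "uuuu");
    TOKENS.put("HH24", "HH");
    TOKENS.put("MM", "MM");
    TOKENS.put("DD", "dd");
    TOKENS.put("MI", "mm");
    TOKENS.put("SS", "ss");
  }

  /** Translates the supported tokens; rejects anything it does not recognize. */
  public static String toJavaPattern(String sqlPattern) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < sqlPattern.length()) {
      String rest = sqlPattern.substring(i);
      String matched = null;
      for (Map.Entry<String, String> e : TOKENS.entrySet()) {
        if (rest.startsWith(e.getKey())) {
          matched = e.getKey();
          out.append(e.getValue());
          break;
        }
      }
      if (matched != null) {
        i += matched.length();
        continue;
      }
      // FF1..FF9: fractional seconds with the given number of digits.
      if (rest.length() >= 3 && rest.startsWith("FF")
          && rest.charAt(2) >= '1' && rest.charAt(2) <= '9') {
        int digits = rest.charAt(2) - '0';
        for (int k = 0; k < digits; k++) {
          out.append('S');
        }
        i += 3;
        continue;
      }
      char c = sqlPattern.charAt(i);
      // Only a few separators are passed through unchanged.
      if ("-/:. ".indexOf(c) < 0) {
        throw new IllegalArgumentException(
            "Unsupported pattern character '" + c + "' at index " + i);
      }
      out.append(c);
      i++;
    }
    return out.toString();
  }

  /** Parses a value using a SQL-style pattern via the translated java.time pattern. */
  public static LocalDateTime parse(String value, String sqlPattern) {
    return LocalDateTime.parse(value, DateTimeFormatter.ofPattern(toJavaPattern(sqlPattern)));
  }

  public static void main(String[] args) {
    System.out.println(toJavaPattern("YYYY-MM-DD HH24:MI"));
    System.out.println(parse("2018-01-02 09:15", "YYYY-MM-DD HH24:MI"));
    System.out.println(parse("2018-01-02 09:15:30.123", "YYYY-MM-DD HH24:MI:SS.FF3"));
  }
}
```

Rejecting unknown characters instead of passing them through keeps the sketch from silently handing reserved java.time pattern letters to DateTimeFormatter; a real implementation would need the full token set and the standard's rules for case and separators.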