Subject: RE: Hive Stored Textfile to Stored ORC taking long time
Hi Jorn
Yes, I will do that test: same file size but with fewer columns.
I created a table with simple columns (all strings, not nested) and I do not
do any transformations. I attach both tables
|
+--+--+
1 row selected (82.071 seconds)
From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: 09 December 2016 10:22
To: user@hive.apache.org
Subject: Re: Hive Stored Textfile to Stored ORC taking long time
Ok.
No, do not split into smaller files. This is done automatically. Your behavior
looks strange. For that file size I w
maller files with FLUME but this I will do
it in the future.
From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: 09 December 2016 09:48
To: user@hive.apache.org
Subject: Re: Hive Stored Textfile to Stored ORC taking long time
How large is the file? Might IO be an issue? How many disks have you on the
only node?
Do you compress the ORC (snappy?).
What is the Hadoop distribution? Configuration baseline? Hive version?
Not sure if I understood your setup, but might the network be an issue?
> On 9 Dec 2016, at 02:08, Joaqu
Gopal Vijayaraghavan
Sent: 09 December 2016 04:17
To: user@hive.apache.org
Subject: Re: Hive Stored Textfile to Stored ORC taking long time
> I have spark with only one worker (same for HDFS) so running now a standalone
> server but with 25G and 14 cores on that worker.
Which version o
Did you do anything to mitigate this issue? Like putting it directly on
HDFS, or going through Spark instead of through Hive?
From: Qiuzhuang Lian [mailto:qiuzhuang.l...@gmail.com]
Sent: 09 December 2016 04:02
To: user@hive.apache.org
Subject: Re: Hive Stored Textfile to Stored ORC taking
> I have spark with only one worker (same for HDFS) so running now a standalone
> server but with 25G and 14 cores on that worker.
Which version of Hive was this?
And was the input text file compressed with something like gzip?
Cheers,
Gopal
Yes, we ran into this issue too, typically when the text Hive table
exceeds 100 million while converting the text table into an ORC table.
On Fri, Dec 9, 2016 at 9:08 AM, Joaquin Alzola wrote:
> Hi List
>
> The transformation from textfile table to stored ORC table takes quite a
> long time.
Hi List
The transformation from textfile table to stored ORC table takes quite a long
time.
Steps followed:
1. Create one normal table using textFile format
2. Load the data normally into this table
3. Create one table with the schema of the expected results of your normal hive
table using store
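
For readers following along, the steps above could be sketched roughly as the HiveQL below, generated here by a small Python helper so the statements can be inspected before running them. This is only a sketch under assumptions: the table names, the input path, the comma delimiter and the SNAPPY codec are hypothetical placeholders, and the final INSERT step is assumed because the original list is cut off.

```python
def textfile_to_orc_statements(columns,
                               text_table="events_text",      # hypothetical name
                               orc_table="events_orc",        # hypothetical name
                               input_path="/tmp/events.csv",  # hypothetical HDFS path
                               compression="SNAPPY"):         # assumed codec
    """Return the HiveQL statements for a textfile -> ORC conversion.

    All columns are declared as STRING, matching the simple test table
    described in the thread.
    """
    cols = ", ".join(f"`{name}` STRING" for name in columns)
    return [
        # 1. Plain text staging table.
        f"CREATE TABLE {text_table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE",
        # 2. Move the raw file into the staging table.
        f"LOAD DATA INPATH '{input_path}' INTO TABLE {text_table}",
        # 3. Target table with the same schema, stored as ORC.
        f"CREATE TABLE {orc_table} ({cols}) "
        f"STORED AS ORC TBLPROPERTIES ('orc.compress'='{compression}')",
        # 4. (Assumed step) rewrite the rows into the ORC table.
        f"INSERT OVERWRITE TABLE {orc_table} SELECT * FROM {text_table}",
    ]


if __name__ == "__main__":
    for stmt in textfile_to_orc_statements(["id", "payload"]):
        print(stmt + ";")
```

The slow part discussed in this thread would be the last statement, since that is where Hive reads the text rows and writes the ORC files.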