on Dataset is
causing OOM issues with the same execution parameters.
Thanks
--
Shivam Sharma
Indian Institute Of Information Technology, Design and Manufacturing
Jabalpur
Email:- 28shivamsha...@gmail.com
LinkedIn:- https://www.linkedin.com/in/28shivamsharma
Hi all,
I just need to know how Spark decides how many partitions should be created
while reading a table from Hive.
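For what it's worth, there is no single property that sets the count directly. When Spark reads a Hive table through its native file readers (e.g. with spark.sql.hive.convertMetastoreParquet/convertMetastoreOrc enabled), the number of read partitions falls out of a computed split size; when the Hive SerDe path is used instead, the splits come from the Hadoop InputFormat and the HDFS block/split settings. The sketch below is a simplification of the native-reader formula (parameter names follow the spark.sql.files.* configs; not the exact Spark source):

```python
import math

def max_split_bytes(total_bytes, n_files,
                    default_max_split=128 * 1024**2,  # spark.sql.files.maxPartitionBytes
                    open_cost=4 * 1024**2,            # spark.sql.files.openCostInBytes
                    default_parallelism=8):           # spark.default.parallelism
    """Simplified sketch of the split size Spark computes for
    non-bucketed file scans; each file is padded by open_cost."""
    padded = total_bytes + n_files * open_cost
    bytes_per_core = padded / default_parallelism
    return int(min(default_max_split, max(open_cost, bytes_per_core)))

# Ten 1 GiB files with default parallelism 8: the split size caps out
# at maxPartitionBytes, so each file yields roughly 8 read partitions.
split = max_split_bytes(10 * 1024**3, 10)
```

Each file then contributes about ceil(file_size / split) partitions, which is why small tables often still produce few partitions while large ones fan out.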
Thanks
--
Shivam Sharma
size of files in the table partition (date='2019-05-14'); the max
file size is 1.1 GB and I have given 7 GB to each executor, so if I am
right above then it should not throw OOM.
3. And when I put LIMIT 10, does spark-hive read all the files?
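On point 2, assuming the files are in a splittable format and are read through Spark's native reader, a 1.1 GB file is not handled by a single task: with the default spark.sql.files.maxPartitionBytes of 128 MB it is split into several read partitions, so raw file size alone rarely explains an OOM (decompressed/deserialized columnar data and wide shuffles are more common culprits). Rough arithmetic:

```python
import math

file_bytes = int(1.1 * 1024**3)       # the 1.1 GB file from the question
max_partition_bytes = 128 * 1024**2   # spark.sql.files.maxPartitionBytes default
tasks = math.ceil(file_bytes / max_partition_bytes)
# tasks == 9: the file is read as ~9 chunks of at most 128 MB each
```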
Thanks
--
Shivam Sharma
ck size"
>
> Arnaud
>
> On Mon, Jan 21, 2019 at 9:01 AM Shivam Sharma <28shivamsha...@gmail.com>
> wrote:
>
Don't we have any property for it?
One more quick question: if files created by Spark are smaller than the HDFS
block size, will the rest of the block space become unavailable and remain
unutilized, or will it be shared with other files?
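For reference (an HDFS fact, not stated in this thread): blocks are not preallocated, so a file smaller than the block size consumes only its actual length on disk, times the replication factor; the unused remainder of the block is not lost, though many small files do inflate NameNode metadata. A toy illustration:

```python
def hdfs_disk_usage(file_bytes, replication=3):
    """HDFS stores only the bytes actually written, replicated;
    a small file does not pin a full 128 MB block of disk."""
    return file_bytes * replication

# A 10 MB file under 3x replication uses ~30 MB of disk, not 3 x 128 MB.
usage = hdfs_disk_usage(10 * 1024**2)
```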
On Mon, Jan 21, 2019 at 1:30 PM Shivam Sharma <28shivam
Spark
to persist according to HDFS blocks.
We have something like this in Hive which solves this problem:
set hive.merge.sparkfiles=true;
set hive.merge.smallfiles.avgsize=204800;
set hive.merge.size.per.task=409600;
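Spark itself has no direct hive.merge.* equivalent; a common workaround (a sketch, assuming the total output size can be estimated, e.g. from the input) is to coalesce or repartition to a file count sized near one HDFS block per file before writing:

```python
import math

def target_output_files(total_bytes, target_file_bytes=128 * 1024**2):
    """Pick a partition count so each output file lands near
    target_file_bytes (one 128 MiB HDFS block by default)."""
    return max(1, math.ceil(total_bytes / target_file_bytes))

# ~1 GiB of output aimed at 128 MiB files -> 8 partitions, then
# (hypothetical DataFrame call) df.coalesce(8).write.saveAsTable(...)
n = target_output_files(1024**3)
```

coalesce avoids a shuffle but can only reduce the partition count; repartition shuffles but balances sizes more evenly.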
Thanks
--
Shivam Sharma
heduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Thanks
--
Shivam Sharma
Did anybody get the above mail?
Thanks
On Fri, Feb 10, 2017 at 11:51 AM, Shivam Sharma <28shivamsha...@gmail.com>
wrote:
> Hi,
>
> I have multiple Hive configurations (hive-site.xml) and because of that
> I am not able to add any Hive configuration in the Spark conf directory
Thanks
--
Shivam Sharma