Hi,
It may be interesting to see this. Can you please create a HiveContext
(using the standard AWS Spark stack on EMR 4.0), create a table over the
Avro file, and read the data into a DataFrame using hiveContext.sql?
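Something along these lines is what I have in mind (a minimal sketch only;
the bucket, paths, table name, and schema file are made up, and it assumes
the Avro schema is available as an .avsc file):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("avro-via-hive"))
val hiveContext = new HiveContext(sc)

// Register the Avro data as an external Hive table. The AvroSerDe reads
// the column layout from the schema file given in TBLPROPERTIES.
hiveContext.sql(
  """CREATE EXTERNAL TABLE IF NOT EXISTS events
    |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    |STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    |OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    |LOCATION 's3://my-bucket/data/events/'
    |TBLPROPERTIES ('avro.schema.url'='s3://my-bucket/schemas/events.avsc')""".stripMargin)

// Read the table back into a DataFrame through the SQL interface.
val df = hiveContext.sql("SELECT * FROM events")
df.show()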
Please let me know if I can be of any help with this.
Regards,
Gourav
On Wed,
Hi,
I think I have the same issue mentioned here:
https://issues.apache.org/jira/browse/SPARK-8898
I tried to run the job with 1 core and it didn't hang anymore. I can live
with that for now, but any suggestions are welcome.
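In case it is useful to anyone else hitting this, the single-core setup is
just a conf change (a sketch; the app name is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// Cap each executor at one core so only a single task at a time
// touches the S3 input; this is what avoids the hang for me.
val conf = new SparkConf()
  .setAppName("s3-read-single-core")
  .set("spark.executor.cores", "1")
val sc = new SparkContext(conf)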
Erisa
On Tue, Jan 26, 2016 at 4:51 PM, Erisa Dervishi wrote:
Hi,
Are you creating RDDs using the textFile option? Can you please let me
know the following (a quick way to check these is sketched right after
the list):
1. Number of partitions
2. Number of files
3. Time taken to create the RDDs
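For 1 and 3, something like this in the spark-shell is enough (a sketch;
the S3 path is a placeholder for your actual input):

// Time the RDD creation and count its partitions. Note that
// partitions.length forces the file listing / split computation.
val start = System.nanoTime
val rdd = sc.textFile("s3n://my-bucket/input/*")
println(s"partitions: ${rdd.partitions.length}")
println(s"listing took: ${(System.nanoTime - start) / 1e9} s")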
Regards,
Gourav Sengupta
On Tue, Jan 26, 2016 at 1:12 PM, Gourav Sengupta
wrote:
> Hi,
>
> Are you creating RDDs out of the data?
Actually, now that I take a closer look at the thread dump, it looks
like all the worker threads are in a WAITING state:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Hi,
Are you creating RDDs out of the data?
Regards,
Gourav
On Tue, Jan 26, 2016 at 12:45 PM, aecc wrote:
> Sorry, I have not been able to solve the issue. I used speculation mode
> as a workaround.
Hi,
I'm kind of in your situation now, trying to read from S3.
Were you able to find a workaround in the end?
Thanks,
Erisa
On Thu, Nov 12, 2015 at 12:00 PM, aecc wrote:
> Some other stats:
>
> The number of files I have in the folder is 48.
> The number of partitions used when reading the data is 7315.
Sorry, I have not been able to solve the issue. I used speculation mode as
a workaround.
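Concretely, the workaround is just this (a sketch; the multiplier is an
illustrative value, not something I have tuned):

import org.apache.spark.{SparkConf, SparkContext}

// With speculation on, Spark re-launches suspiciously slow (here: hung)
// copies of tasks on other executors and takes whichever finishes first.
val conf = new SparkConf()
  .setAppName("s3-read-with-speculation")
  .set("spark.speculation", "true")
  .set("spark.speculation.multiplier", "1.5")
val sc = new SparkContext(conf)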
Any hints?
Reading files directly from Amazon S3 can be frustrating, especially if
you're dealing with a large number of input files. Could you please
elaborate on your use case? Does the S3 bucket in question already
contain a large number of files?
The implementation of the * wildcard operator in S3
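To make the question concrete, this is the kind of direct S3 read I mean
(a sketch; the bucket and prefix are hypothetical):

// The * glob is expanded by Hadoop's FileInputFormat, which has to list
// the matching keys in the bucket; with many objects this listing alone
// can take a long time.
val lines = sc.textFile("s3n://my-bucket/logs/2015/*/*.gz")
println(lines.partitions.length) // forces the listing / split computation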
Some other stats:
The number of files I have in the folder is 48.
The number of partitions used when reading the data is 7315.
The maximum size of a single file to read is 14G.
The total size of the folder is around 270G.
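(For what it's worth, that works out to roughly 270G / 7315 ≈ 38MB per
partition on average.)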
Any help on this? This is really blocking me, and I haven't found any
feasible solution yet.
Thanks.
Are you sitting behind a proxy or something? Can you look more into the
executor logs? I have a strange feeling that you are blowing through the
executor memory (and possibly hitting long GC pauses).
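If you want to check the GC theory, verbose GC logs on the executors are
the quickest way (a sketch; the flags are standard JVM options):

import org.apache.spark.SparkConf

// Print GC activity to the executor stderr logs; long pauses there
// would back up the memory theory.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
       "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")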
Thanks
Best Regards
On Thu, Sep 10, 2015 at 10:05 PM, Mario Pastorelli <
mario.pastore...@teralytics.ch> wrote:
> Dear