Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-27 Thread Gourav Sengupta
Hi, it may be interesting to see this. Can you please create a HiveContext (using the standard AWS Spark stack on EMR 4.0), create a table to read the Avro file, and read the data into a DataFrame using HiveContext SQL? Please let me know if I can be of any help with this. Regards, Gourav On Wed,

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-27 Thread Erisa Dervishi
Hi, I think I have the same issue mentioned here: https://issues.apache.org/jira/browse/SPARK-8898 I tried to run the job with 1 core and it didn't hang anymore. I can live with that for now, but any suggestions are welcome. Erisa On Tue, Jan 26, 2016 at 4:51 PM, Erisa Dervishi

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-26 Thread Gourav Sengupta
Hi, are you creating RDDs using the textFile option? Can you please let me know the following: 1. Number of partitions 2. Number of files 3. Time taken to create the RDDs Regards, Gourav Sengupta On Tue, Jan 26, 2016 at 1:12 PM, Gourav Sengupta wrote: > Hi, > > are

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-26 Thread Erisa Dervishi
Actually, now that I have taken a closer look at the thread dump, it looks like all the worker threads are in a "Waiting" state: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
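To check how many threads are parked, as described in the message above, the states in a dump can be tallied with a short script. This is a minimal sketch that assumes a jstack-style text dump; the helper name count_thread_states and the SAMPLE dump (including its thread names) are hypothetical, and only the two park frames are taken from the message.

```python
import re
from collections import Counter

# jstack prints one "java.lang.Thread.State: <STATE>" line per thread.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def count_thread_states(dump_text):
    """Count how many threads are in each JVM thread state."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical sample; thread names are made up for illustration.
SAMPLE = '''\
"worker-1" #41 daemon prio=5
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

"worker-2" #42 daemon prio=5
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"main" #1 prio=5
   java.lang.Thread.State: RUNNABLE
'''

if __name__ == "__main__":
    for state, count in count_thread_states(SAMPLE).most_common():
        print(f"{state}: {count}")
```

If every executor worker thread shows up as WAITING, that matches the symptom reported in this thread; it does not by itself identify what the threads are waiting for.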

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-26 Thread Gourav Sengupta
Hi, are you creating RDDs out of the data? Regards, Gourav On Tue, Jan 26, 2016 at 12:45 PM, aecc wrote: > Sorry, I have not been able to solve the issue. I used speculation mode as a > workaround for this. > > > > -- > View this message in context: >

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-26 Thread Erisa Dervishi
Hi, I am in much the same situation now while trying to read from S3. Were you able to find a workaround in the end? Thanks, Erisa On Thu, Nov 12, 2015 at 12:00 PM, aecc wrote: > Some other stats: > > The number of files I have in the folder is 48. > The number of

Re: Spark task hangs infinitely when accessing S3 from AWS

2016-01-26 Thread aecc
Sorry, I have not been able to solve the issue. I used speculation mode as a workaround for this. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-task-hangs-infinitely-when-accessing-S3-from-AWS-tp25289p26068.html Sent from the Apache Spark User List
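For readers who want to try the speculation workaround mentioned above, here is a minimal sketch that builds the matching spark-submit flags. The property names are standard Spark settings; the values shown, the helper name speculation_flags, and the job file name your_job.py are illustrative assumptions, not the settings aecc actually used.

```python
def speculation_flags(interval="100ms", multiplier=1.5, quantile=0.75):
    """Return spark-submit --conf arguments that enable speculative execution.

    The values are illustrative assumptions; tune them for your own job.
    """
    conf = {
        # Re-launch copies of tasks that run much slower than their peers.
        "spark.speculation": "true",
        # How often Spark checks for tasks to speculate.
        "spark.speculation.interval": interval,
        # How many times slower than the median a task must be.
        "spark.speculation.multiplier": str(multiplier),
        # Fraction of tasks that must finish before speculation starts.
        "spark.speculation.quantile": str(quantile),
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "your_job.py" is a placeholder for the actual application.
    print(" ".join(["spark-submit"] + speculation_flags() + ["your_job.py"]))
```

Speculation lets a second attempt of a stuck task finish in place of the hung one, so it works around the hang rather than fixing its cause.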

Re: Spark task hangs infinitely when accessing S3 from AWS

2015-11-12 Thread aecc
Any hints? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-task-hangs-infinitely-when-accessing-S3-from-AWS-tp25289p25365.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark task hangs infinitely when accessing S3 from AWS

2015-11-12 Thread Michael Cutler
Reading files directly from Amazon S3 can be frustrating, especially if you're dealing with a large number of input files. Could you please elaborate on your use case? Does the S3 bucket in question already contain a large number of files? The implementation of the * wildcard operator in S3

Re: Spark task hangs infinitely when accessing S3 from AWS

2015-11-12 Thread aecc
Some other stats: The number of files I have in the folder is 48. The number of partitions used when reading data is 7315. The maximum size of a file to read is 14G. The size of the folder is around 270G. -- View this message in context:
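As a rough sanity check on the numbers above, the average partition size and the number of partitions per file can be worked out directly. This sketch assumes "G" means GiB and that partitions are spread evenly by size, neither of which the message states.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Figures quoted in the message; the GiB interpretation is an assumption.
folder_bytes = 270 * GIB
largest_file_bytes = 14 * GIB
num_files = 48
num_partitions = 7315

# Average amount of data each partition (and so each task) has to read.
avg_partition_bytes = folder_bytes / num_partitions
avg_partition_mib = avg_partition_bytes / MIB

# Average number of partitions per input file.
partitions_per_file = num_partitions / num_files

# Partitions the largest file would need at the average partition size.
largest_file_partitions = largest_file_bytes / avg_partition_bytes

print(f"average partition size: {avg_partition_mib:.1f} MiB")
print(f"partitions per file:    {partitions_per_file:.1f}")
print(f"largest file splits:    {largest_file_partitions:.1f}")
```

Under these assumptions each task reads roughly 38 MiB, so the individual reads are small and the hang is unlikely to come from a single oversized partition.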

Re: Spark task hangs infinitely when accessing S3 from AWS

2015-11-09 Thread aecc
Any help on this? This is really blocking me and I haven't found a feasible solution yet. Thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-task-hangs-infinitely-when-accessing-S3-from-AWS-tp25289p25327.html Sent from the Apache Spark User List

Re: Spark task hangs infinitely when accessing S3

2015-09-14 Thread Akhil Das
Are you sitting behind a proxy or something? Can you look more into the executor logs? I have a strange feeling that you are blowing the memory (and possibly hitting GC etc). Thanks Best Regards On Thu, Sep 10, 2015 at 10:05 PM, Mario Pastorelli < mario.pastore...@teralytics.ch> wrote: > Dear